Measures of Adult General Functional Status: SF-36 Physical Functioning Subscale (PF-10), Health Assessment Questionnaire (HAQ), Modified Health Assessment Questionnaire (MHAQ), Katz Index of Independence in Activities of Daily Living, Functional Independence Measure (FIM), and Osteoarthritis-Function-Computer Adaptive Test (OA-Function-CAT)

DANIEL K. WHITE, JESSICA C. WILSON, AND JULIE J. KEYSOR

INTRODUCTION

Self-reported measures to assess and quantify functional status are important tools for clinicians and investigators. These measures qualify limitation with different types of functional activities and quantify the extent of limitation. We are particularly interested in measures of general functional status. Although these “generic” measures of function were originally developed in other patient populations, they are relevant to the field of rheumatology. In particular, these instruments have been found to be valid and reliable measures of function that are sensitive to changes in function and have distinct thresholds for important change in people with rheumatologic disease.

Some notable studies have been added to the literature for general functional status measures in the last decade. Most of these additions are in the area of identifying thresholds for minimum clinically important difference, i.e., the smallest amount of change associated with a minimally important decline or improvement in function. To reflect changes in clinical practice over the last decade, we chose to review the Functional Independence Measure, which is commonly used in practice to assess function. Also, Computer Adaptive Testing has developed over the past decade and represents an innovative and exciting change in how self-reported tests of function are administered. Therefore, the purpose of this report is to provide an update on measures of general functional status commonly employed for people with rheumatologic diseases and to review a Computer Adaptive Testing measure of functioning for people with osteoarthritis.

SF-36 PHYSICAL FUNCTIONING SUBSCALE (PF-10)

Description

Purpose. The PF-10 is a generic outcome measure designed to examine a person’s perceived limitation with physical functioning (1) and is a subscale within the Medical Outcomes Study 36-item Short Form Health Survey (SF-36).

Content. Subjects are asked if their health limits physical activity, basic mobility, and basic activities of daily living.

Number of items in scale. There are 10 items.

Response options/scale. Responses are rated on a Likert scale. For the SF-36 versions 1.0 and 2.0, each item is rated on a 3-point scale (yes, limited a lot; yes, limited a little; and no, not limited at all). For the Patient-Reported Outcomes Measurement Information System (PROMIS) version, each item is rated on a 5-point scale (not at all, very little, somewhat, quite a lot, and cannot do).

Recall period for items. Respondents are asked to rate limitation at present for most questions and over the past 4 weeks for other questions.

Endorsements. None.

Examples of use. The PF-10 was developed as a generic health outcome instrument within the SF-36 for a wide variety of medical conditions in people ages 14–61. The PF-10 has been applied to older adult populations, as well as to people with rheumatoid arthritis (RA), back pain, osteoarthritis (OA), and gout (2).

Practical Application

How to obtain. The PF-10 instrument, scoring manual, and license are available from QualityMetric at www.qualitymetric.com. There is a charge at different rates for commercial and academic use. The PROMIS version of the PF-10 is available for viewing at http://www.nihpromis.org/default.aspx.

Dr. White’s work was supported by an American College of Rheumatology Research and Education Foundation Rheumatology Investigator Award, the Boston Claude D. Pepper Older Americans Independence Center (P30-AG031679), and the Foundation for Physical Therapy.

Daniel K. White, PT, ScD, Jessica C. Wilson, PT, Julie J. Keysor, PT, PhD: Boston University, Boston, Massachusetts.

Address correspondence to Daniel White, PT, ScD, Clinical Epidemiology Research and Training Unit, 650 Albany Street, X200, Boston, MA 02118. E-mail: [email protected].

Submitted for publication January 24, 2011; accepted in revised form May 10, 2011.

Arthritis Care & Research, Vol. 63, No. S11, November 2011, pp S297–S307. DOI 10.1002/acr.20638. © 2011, American College of Rheumatology.

Method of administration. Interviewer (in person or by telephone) or self-administered.

Scoring. Answers to each question are summed to produce raw scores and then transformed to a 0–100 scale.
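The arithmetic behind this transformation can be sketched as follows. This is a minimal illustration and not the licensed QualityMetric scoring algorithm: it assumes each of the 10 items is coded from 1 (limited a lot) to 3 (not limited at all), so that raw scores run from 10 to 30, and it does not handle missing responses. The function name `pf10_score` is hypothetical.

```python
from typing import Sequence


def pf10_score(items: Sequence[int]) -> float:
    """Return a 0-100 PF-10 score from ten item responses.

    Assumed coding (not stated in the text): 1 = limited a lot,
    2 = limited a little, 3 = not limited at all.
    """
    if len(items) != 10:
        raise ValueError("the PF-10 needs exactly 10 item responses")
    if any(value not in (1, 2, 3) for value in items):
        raise ValueError("each response must be 1, 2, or 3")
    raw = sum(items)              # raw score runs from 10 to 30
    return (raw - 10) / 20 * 100  # rescale so that 10 -> 0 and 30 -> 100
```

Under these assumptions, a respondent who answers "limited a little" to every item receives a score of 50, and higher scores correspond to less limitation.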

Score interpretation. Higher scores represent better health status. For the SF-36 version 2.0, the total PF-10 score is standardized to a mean of 50. Population norms are available for the US (3) and the UK (4,5). World data for cross-cultural comparisons are available as well (6).
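Standardizing to a mean of 50 is commonly done with a linear T-score transformation. The sketch below assumes a standard deviation of 10 on the standardized scale, which the text does not state, and takes the reference population mean and standard deviation as inputs. The function name and any values passed to it are illustrative assumptions, not published norms.

```python
def norm_based_score(score_0_100: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a 0-100 score to a scale with mean 50 in the reference population.

    norm_mean and norm_sd describe the reference population on the 0-100
    scale; the target standard deviation of 10 is an assumption.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (score_0_100 - norm_mean) / norm_sd  # distance from the norm in SD units
    return 50 + 10 * z
```

With this convention, a respondent scoring exactly at the reference mean receives 50, and scores above 50 indicate better physical functioning than the reference population average.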

Respondent burden. Less than 10 minutes is needed to complete the instrument. Questions are worded at a sixth- to ninth-grade level.

Administrative burden. Less than 10 minutes is necessary to administer the instrument, and a few minutes are needed to score the results via computer. No training is required.

Translations/adaptations. There are 2 versions of the PF-10: the original SF-36 version 1.0 and the updated SF-36 version 2.0. Most recently, the PROMIS created a 10-item physical functioning scale that has 5 of the same questions as the PF-10 versions pertaining to vigorous activities and basic mobility, and 5 questions pertaining to basic and instrumental activities of daily living (7). However, this review will focus on the SF-36 versions 1.0 and 2.0. The SF-36 versions of the PF-10 are available in more than 50 different languages. More information on the availability of the PF-10 in other languages can be found from the International Quality of Life Assessment project at www.iqola.org.

Psychometric Information

Method of development. PF-10 questions were selected to assess a variety of physical activities ranging from easy to strenuous. The questionnaire was first examined in a group of subjects participating in the Medical Outcomes Study (8).

Acceptability. Missing data are not common. The PF-10 was designed to have low ceiling and floor effects.

Reliability. High test–retest reliability has been found in people with RA (intraclass correlation coefficient [ICC] 0.93) (9) and low back pain (ICC 0.83–0.91) (10). High internal consistency has also been reported for older adults (Cronbach’s α = 0.82) (11) and people with gout (Cronbach’s α ≥0.93) (12).
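For readers less familiar with internal consistency, Cronbach's α compares the sum of the individual item variances with the variance of the total score. The sketch below is a generic textbook computation, not the analysis used in the cited studies; the function name is hypothetical and any data passed to it would be the reader's own.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of item scores."""
    if len(responses) < 2:
        raise ValueError("need at least 2 respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least 2 items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent needs a score for every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary, so alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

Values approach 1 when items rise and fall together across respondents, which is the pattern described as high internal consistency.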

Validity. Criterion validity. The PF-10 has been found to be associated with both generic and disease-specific measures of functional outcome in a variety of rheumatologic patient populations. For subjects with hip or knee OA, Salaffi and colleagues reported a high correlation between the PF-10 and the Western Ontario and McMaster Universities Arthritis Index physical function subscale (r = −0.65) (13). Similarly, a moderate correlation between the PF-10 and the Timed Up and Go Test was reported in subjects following total hip or knee replacement (r = −0.34) (14). For people from Norway with RA, the PF-10 has been found to have strong correlations with the Modified Health Assessment Questionnaire (r = −0.69) and the Arthritis Impact Measurement Scale physical domain (r = −0.73) (15). Lastly, the PF-10 has been shown to be highly correlated with the Late Life Function and Disability Index in older adults (r = 0.74–0.88) (2).

Construct validity. The PF-10 has been found to measure a single or unidimensional index in subjects with chronic medical and psychiatric conditions from the US (16), and in people with psoriatic arthritis (17). The PF-10 was also found to measure a unidimensional index among subjects from the general population from 7 countries, including Denmark, Germany, Italy, the US, Sweden, The Netherlands, and the UK (18).

Ability to detect change. The PF-10 has been found to be sensitive and responsive to change in subjects with RA (9,19), spine pathology (10,20,21), and chronic medical and psychiatric diseases (22). In particular, the PF-10 was able to discriminate between groups of people with RA at different levels of improvement measured by the American College of Rheumatology criteria following a drug trial (19). Similarly, the PF-10 was sensitive to change in people with spine pathology undergoing physical therapy (10). Lastly, using data from subjects with chronic disease within the Medical Outcomes Study, McHorney and colleagues reported that the PF-10 had similar sensitivity to change regardless of whether scores were Rasch-transformed or not (22).

The minimum clinically important difference (MCID) for the PF-10 has been examined in subjects with spine pathology, specifically intervertebral disc herniation. In this patient population, the MCID is reported as ranging between 5 and 30 for the PF-10 (20).

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths. The PF-10 is an important instrument of general physical function relevant to the rheumatology community. It evaluates limitations in function common in people with rheumatology-related disease and can be used to evaluate changes following intervention, especially in people with RA and spine pathology.

Caveats and cautions. The psychometrics for the PF-10 have not been consistently investigated across all rheumatologic conditions. For instance, more work is needed to establish MCID thresholds for the PF-10 in people with RA.

Clinical usability. The psychometric evaluation of the PF-10 supports interpretation of scores for individual patients, and the PF-10 can be employed in the clinic given the short administration time.

Research usability. The psychometric evaluation of the PF-10 supports use in intervention studies and observational studies.

HEALTH ASSESSMENT QUESTIONNAIRE (HAQ)

Description

Purpose. This review focuses on the HAQ disability index with an emphasis on the use of the HAQ as a measure of general function (23). The HAQ measures difficulty in performing activities of daily living. It is the most widely used functional measure in rheumatology. The HAQ was specifically developed for use among adults with arthritis, but it has since been used in a wide range of populations (24).

Content. Questions assess difficulty over the past week in 20 specific functions that are grouped into 8 categories: dressing and grooming, arising, eating, walking, personal hygiene, reaching, gripping, and other activities.

Number of items. There are 20 items covering 8 subscales: dressing and grooming (2 items: dress yourself, including tying shoelaces and fastening buttons, and shampoo your hair); arising (2 items: stand up straight from an armless straight chair, get in and out of bed); eating (3 items: cut your meat, lift a full cup or glass to your mouth, open a new milk carton); walking (2 items: walk outdoors on flat ground, climb up 5 steps); personal hygiene (3 items: wash and dry your entire body, take a tub bath, get on and off the toilet); reaching (2 items: reach and get down a 5-pound object from just above your head, bend down to pick up clothing from the floor); gripping (3 items: open car doors, open jars that have been previously opened, turn faucets on and off); and other activities (3 items: run errands and shop, get in and out of a car, do chores such as vacuuming or yard work). In addition, the use of personal assistance, assistive aids, or devices is measured.

Response options/scale. Each item is rated from 0–3, where 0 = no difficulty, 1 = some difficulty, 2 = much difficulty, and 3 = unable to do. The highest score within a category is used as the category score. Dependence on physical assistance or equipment raises the category score to 2. The HAQ score is calculated as the mean of the 8 category scores. Scores range from 0–3 in increments of 0.125. The overall score is not calculated if fewer than 6 category scores are completed.
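The scoring rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the official scoring program: the function names and the input layout are hypothetical, unanswered items are represented as `None`, and a category already scored 3 is assumed to stay at 3 when help or equipment is used (the text says only that dependence "raises" the score to 2).

```python
from typing import Mapping, Optional, Sequence, Tuple

HAQ_CATEGORIES = (
    "dressing and grooming", "arising", "eating", "walking",
    "personal hygiene", "reaching", "gripping", "other activities",
)


def haq_category_score(items: Sequence[Optional[int]], uses_help: bool) -> Optional[int]:
    """Highest answered item in the category, raised to 2 if help or aids are used."""
    answered = [score for score in items if score is not None]
    if not answered:
        return None  # category not completed
    if any(score not in (0, 1, 2, 3) for score in answered):
        raise ValueError("item scores must be 0, 1, 2, or 3")
    score = max(answered)
    if uses_help:
        score = max(score, 2)  # assumption: a score of 3 is left unchanged
    return score


def haq_score(
    categories: Mapping[str, Tuple[Sequence[Optional[int]], bool]]
) -> Optional[float]:
    """Mean of the completed category scores, or None if fewer than 6 are completed."""
    scores = [haq_category_score(items, helped) for items, helped in categories.values()]
    completed = [score for score in scores if score is not None]
    if len(completed) < 6:
        return None
    return sum(completed) / len(completed)
```

For instance, a respondent who reports no difficulty anywhere except "some difficulty" walking outdoors, and who uses a cane, has a walking category score of 2 and an overall HAQ score of 2/8 = 0.25.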

Recall period for items. The past week.

Endorsements. None.

Examples of use. The HAQ was developed for individuals with rheumatoid arthritis (RA) and osteoarthritis (OA).

Practical Application

How to obtain. The English version of the HAQ and the Patient-Reported Outcomes Measurement Information System (PROMIS) HAQ and scoring directions are provided free of charge at http://aramis.stanford.edu/.

Method of administration. Interviewer (in person or by telephone) or self-administered (paper or electronic touch-screen version). The touch-screen version is self-explanatory and accessible for people with reduced motor function (25).

Scoring. The HAQ is hand scored. Alternate methods of scoring have been developed (for example, scoring without taking use of assistance or aids into account [26] or using the mean category score instead of the highest score [27]), but these scoring methods have not gained wide use. Wolfe suggests that even if alternative scoring methods are used, the traditional score should also be calculated to allow comparison with published data (28).

Score interpretation. Higher scores reflect more activity limitation. The overall estimated normal HAQ score was 0.25, with an average of 0.18 for males and 0.28 for females, within a general population sample of 1,530 people age ≥30 years in Central Finland (29). Approximately one-third of the respondents reported some sort of disability (HAQ >0). The prevalence of disability increases exponentially after age 50 years (29).

Respondent burden. Less than 10 minutes are needed to complete the HAQ. Questions are worded at a sixth- to ninth-grade level.

Administrative burden. Less than 10 minutes are needed to administer the HAQ, and less than 2 minutes are needed to score the HAQ. No training is necessary.

Translations/adaptations. Many adaptations and/or translations are available, including English (US, Canada, Australia), Belgian Flemish and French, Canadian French, Chinese (Cantonese, Hong Kong), Danish, French, German, Spanish (US, Spain, many Central and South American countries), Swedish, and Turkish. For a complete listing, see Bruce and Fries (30). A revised version of the HAQ, the HAQ-II, has been developed and contains 10 items (31). A PROMIS HAQ has been developed, which contains the same 20 items as the original HAQ, but they were qualitatively improved to increase the clarity and psychometric properties of the measure (32).

Psychometric Information

Method of development. The HAQ was originally developed by using questions from a variety of instruments employed in the 1970s (23).

Acceptability. Missing data are not common. The HAQ has ceiling limitations, i.e., people with mild functional limitation can have normal HAQ scores.

Reliability. High test–retest reliability has been found in subjects with gout. Specifically, the test–retest reliability of the entire HAQ was intraclass correlation coefficient (ICC) 0.76, with individual subscales ranging from ICC 0.68 to ICC 0.80 (33). High correlations between interviewer versus self-administered forms of the instrument have been reported (range 0.60–0.88) (24), as well as between a touch-screen and paper version (ICC 0.99) (25).

Validity. For criterion validity, Daltroy et al (34) found a strong correlation (−0.72) between HAQ scores and a physical capacity measure in older adults.

For construct validity, HAQ scores are comparable across people with RA, OA, or gout using item response theory, which suggests the HAQ measures a single underlying construct of disability (35). Several studies have shown significant correlations of HAQ scores with other measures of function (e.g., Arthritis Impact Measurement Scale and Western Ontario and McMaster Universities Arthritis Index [WOMAC]), supporting the HAQ as a valid measure of general function (30,36–38).

Ability to detect change. The HAQ is a sensitive and responsive measure to changes in function in people with knee or hip OA. For people undergoing hip or knee joint replacement, the HAQ is responsive to functional change following surgery (39). Similarly, the HAQ has been found to be more sensitive to change over 3 years in people with hip or knee OA than the WOMAC (38). The HAQ has been found to have a ceiling effect, i.e., it does not discriminate well between people with low levels of disability (31,40). The minimum clinically important difference (MCID) for the HAQ has been examined in a variety of rheumatologic-related populations including RA, psoriatic arthritis, systemic lupus erythematosus, spondylarthropathies, and scleroderma. The range for MCID is −0.08 to −0.25 for improvement and 0.13 to 0.22 for decline (41–47). Several authors have commented that MCID values may depend on the severity of disability. Specifically, less change was needed to meet a meaningful threshold for improvement for people with low levels of disability compared with those with a high level of disability (41,43,46).
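To show how MCID thresholds of this kind are typically applied to an individual's change score, the sketch below classifies the difference between two HAQ scores. The function name is hypothetical, and the thresholds are deliberately left as required arguments: the text reports ranges rather than single values, so which value within each range to use is the analyst's assumption.

```python
def classify_haq_change(
    baseline: float,
    follow_up: float,
    improvement_mcid: float,
    decline_mcid: float,
) -> str:
    """Label a change in HAQ score relative to caller-supplied MCID thresholds.

    improvement_mcid must be negative (HAQ scores fall as function improves);
    decline_mcid must be positive.
    """
    if improvement_mcid >= 0 or decline_mcid <= 0:
        raise ValueError("improvement_mcid must be < 0 and decline_mcid must be > 0")
    change = follow_up - baseline
    if change <= improvement_mcid:
        return "improved"
    if change >= decline_mcid:
        return "declined"
    return "no important change"
```

Choosing the most demanding ends of the reported ranges (−0.25 and 0.22) gives a conservative classification. Because the text notes that MCID values may depend on baseline severity, a single pair of thresholds may not suit all patients.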

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths. The HAQ measures important limitations in function relevant to many people with rheumatology-related disorders. Given that MCID values have been established for multiple rheumatologic populations, the HAQ is appropriate for evaluating interventions. It is notable that all studies investigating MCID reported a similar range of values, making the HAQ a useful measurement of function and change in function.

Caveats and cautions. The reliability, validity, and responsiveness of the HAQ require more investigation. Specifically, the psychometrics of the HAQ need to be established across more rheumatologic patient populations. Clinical investigators should be aware that the HAQ does have floor effects, and may be less responsive to change among individuals with low levels of disability.

Clinical usability. The psychometric evaluation of the HAQ supports interpretation of scores for individual patients with moderate to severe disability, and the HAQ can be employed in the clinic given the short administration time.

Research usability. The psychometric evaluation of the HAQ supports use in intervention studies and observational studies, although sensitivity to change will likely be limited in people with mild disability.

MODIFIED HEALTH ASSESSMENT QUESTIONNAIRE (MHAQ)

Description

Purpose. The MHAQ is a modified version of the HAQ (48).

Content. The number of specific activities queried is reduced from 20 to 8 (1 item is used from each of the 8 categories covered in the HAQ). The MHAQ has 4 subscales that assess degree of difficulty, satisfaction with function, change in function over the past 6 months, and perceived need for help with each activity. The degree of difficulty subscale is the most commonly used.

Number of items. There are 8 items (dressing, arising, eating, walking, hygiene, reaching, gripping, and getting in and out of car) repeated in each of the 4 subscales.

Response options/scale. For the difficulty subscale (“Are you able to. . .?”), the scale is 0 = without any difficulty, 1 = with some difficulty, 2 = with much difficulty, and 3 = unable to do. Any positive response regarding help or assistive devices raises the score to 2. For satisfaction (“How satisfied are you with your ability to. . .?”), 0 = satisfied and 1 = dissatisfied. For change in difficulty (“Compared to 6 months ago, how difficult is it now [this week] to. . .?”), 0 = less difficult now, 1 = no change, and 2 = more difficult now. For need for help (“Do you need help to. . .?”), 0 = do not need help and 1 = need help. Scale scores are the mean of the scores on the 8 items within the scale: difficulty 0–3, satisfaction 0–1, change in function 0–2, and need for help 0–1.
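The subscale arithmetic described here reduces to a range check and a mean over 8 items. The sketch below is illustrative only: the function name and subscale keys are hypothetical, all 8 items are assumed to be answered, and no adjustment for help or assistive devices is applied.

```python
from typing import Sequence

# Highest allowed item value for each MHAQ subscale, as listed in the text.
MHAQ_ITEM_MAX = {
    "difficulty": 3,
    "satisfaction": 1,
    "change": 2,
    "need_for_help": 1,
}


def mhaq_scale_score(scale: str, items: Sequence[int]) -> float:
    """Mean of the 8 item scores for one MHAQ subscale."""
    if scale not in MHAQ_ITEM_MAX:
        raise ValueError(f"unknown subscale: {scale}")
    if len(items) != 8:
        raise ValueError("each MHAQ subscale has exactly 8 items")
    top = MHAQ_ITEM_MAX[scale]
    if any(value not in range(top + 1) for value in items):
        raise ValueError(f"{scale} items must be whole numbers from 0 to {top}")
    return sum(items) / 8
```

A respondent reporting "some difficulty" on 4 of the 8 activities and none on the rest would have a difficulty score of 0.5 under this sketch.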

Recall period for items. Up to 6 months.

Endorsements. None.

Examples of use. People with rheumatic conditions (48).

Practical Application

How to obtain. Available in original reference (48).

Method of administration. Interviewer or self-administered.

Scoring. Arithmetic calculation by hand.

Score interpretation. Higher scores reflect poorer health.

Respondent burden. Less than 5 minutes are needed to complete the MHAQ. Questions are worded at a sixth- to ninth-grade level.

Administrative burden. Less than 5 minutes are needed to administer the MHAQ, and less than 2 minutes are needed to score the MHAQ. No training is necessary.

Translations/adaptations. Two subsequent versions of the HAQ have been developed, the Multidimensional Health Assessment Questionnaire (48) and the HAQ-II (31). Both instruments were developed to address ceiling problems associated with the MHAQ (40).

Psychometric Information

Method of development. Questions in the MHAQ are taken directly from the HAQ.

Acceptability. Missing data are not common. The MHAQ has floor limitations and ceiling limitations (48).

Reliability. The test–retest reliability for the difficulty scale over 1 month was reported as 0.91 (48).

Validity. For concurrent validity for the difficulty scale, the MHAQ is highly correlated with the overall score of the HAQ (0.88), the Arthritis Impact Measurement Scale physical component (0.80), and the Short Form 36 physical function scale (0.71) in people with rheumatoid arthritis (40). Blalock and colleagues also examined the equivalency of the HAQ with the MHAQ, and found that although the scores were highly correlated, the MHAQ scores were consistently and significantly lower (indicating better function) than the HAQ score (49). Uhlig and coauthors also found large numerical differences in scores, especially at higher disability levels (40). In every category, HAQ items chosen for the MHAQ had a lower mean than the MHAQ-excluded items (49). For construct validity for the difficulty scale, the MHAQ scores have been found to be associated with measures of physical performance (e.g., walk test, grip strength) (50). For construct validity for the dissatisfaction with function scale, scores were incrementally greater (more dissatisfied) as difficulty in function increased (48).

Ability to detect change. For the difficulty scale, Blalock and colleagues suggest that the MHAQ is relatively insensitive to low levels of disability, and because of its restricted range and skewed distribution, should be used with caution when the intent is to assess functional change (49). Uhlig et al also reported a considerable ceiling effect for the MHAQ (40). Stucki et al (scores ≤0.3 [51]) and Wolfe (scores <1.0 [28]) also noted clustering of scores at the low end of the scale. Ziebland et al found that the MHAQ change in difficulty scale was more sensitive to changes in clinical variables (i.e., correlated more highly with variables such as grip strength, pain, morning stiffness, and erythrocyte sedimentation rate) than a pre-post difference in the traditional HAQ score (52).

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths. Similar to the HAQ, the MHAQ measures important limitations in function relevant to many people with rheumatology-related disorders. Given its abbreviated form, the MHAQ should be considered when the full version of the HAQ cannot be implemented due to time constraints.

Caveats and cautions. The majority of psychometric analysis of the MHAQ has focused on the difficulty subscale, and has generally found that it appears to be less psychometrically sound than the HAQ. Blalock et al noted that scores on the MHAQ were consistently lower than those on the HAQ (49). On the overall difficulty score, MHAQ scores were on average 0.67 lower than HAQ scores calculated with adjustment for help and/or assistive devices, and 0.52 lower than HAQ scores calculated without such adjustments. The MHAQ does not make adjustments for use of help or assistive devices. Blalock also noted that while the HAQ scores were normally distributed across the scale’s full possible range (0–3), MHAQ scores were not normally distributed and ranged only from 0–1.75. Similar findings were also noted by Stucki et al (51) and Wolfe (28). The MHAQ also has a considerable ceiling effect, which is greater than that of the HAQ (40). There are conflicting reports about correlations between MHAQ scores and clinical and laboratory variables. Wolfe concluded that the advantages in the length of the MHAQ over the HAQ were offset by loss of sensitivity and responsiveness to change (28).

Clinical usability. The psychometric evaluation of the MHAQ supports limited use in the clinic; however, floor and ceiling effects should be considered when interpreting scores.

Research usability. Given the MHAQ’s limited ability to detect change, research use is not recommended.

KATZ INDEX OF INDEPENDENCE IN ACTIVITIES OF DAILY LIVING

Description

Purpose. To quantify independence in activities of daily living (ADL) across a wide range of patient populations (53).

Content. Basic ADL (bathing, dressing, toileting, transfers, continence, and feeding). Katz et al noted that the loss of functional skills occurs in a specific order, with the most complex lost first (54). The scoring method for this scale reflects this hierarchy of function.

Number of items. 6, 1 for each ADL.

Response options/scale. Each ADL is scored on a 3-point scale of independence. Items are ordered by difficulty. The scoring reflects this, although some variation in the hierarchy of difficulty is allowed. Katz reported that ADL functions of 86% of evaluated subjects were consistent with the hierarchy (54). Score range is A–G or 0–6.

Recall period for items. Immediate.

Endorsements. None.

Examples of use. The Katz Index of ADL has been used in older adults (55), people with stroke (56), and older adults with hip fracture (57).

Practical Application

How to obtain. Available from original reference (54) and at www.npcrc.org/resources/resources_show.htm?doc_id=376169.

Method of administration. Examiner-administered via observation of the patient.

Scoring. Independence in various combinations of ADL determines ordinal rank on the alpha scale, and the number of ADLs for which the individual is dependent determines the score on the numeric scale. Ratings are made on an 8-level ordinal scale, where A = independence in feeding, continence, transferring, going to toilet, dressing, and bathing; B = independent in all but 1 of these functions; C = independent in all but bathing and 1 additional function; D = independent in all but bathing, dressing, and 1 additional function; E = independent in all but bathing, dressing, going to toilet, and 1 additional function; F = independent in all but bathing, dressing, going to toilet, transferring, and 1 additional function; G = dependent in all 6 functions; and other = dependent in at least 2 functions, but not classifiable as C, D, E, or F. Katz and Akpom later proposed a simplified scoring system in which individuals are scored 0–6, reflecting the number of ADLs in which they are dependent (58).
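The grading rules above follow a fixed hierarchy (bathing, dressing, going to toilet, transferring), which makes them straightforward to express in code. The sketch below is a minimal illustration based only on the definitions in this section; the function names and the lowercase labels used for the 6 functions are hypothetical.

```python
from typing import Iterable

# The first four entries are in the hierarchical order used by grades C-F.
KATZ_FUNCTIONS = (
    "bathing", "dressing", "toileting", "transferring", "continence", "feeding",
)


def katz_dependency_count(dependent: Iterable[str]) -> int:
    """Simplified 0-6 score: the number of ADLs in which the person is dependent."""
    functions = set(dependent)
    unknown = functions - set(KATZ_FUNCTIONS)
    if unknown:
        raise ValueError(f"unknown ADL(s): {sorted(unknown)}")
    return len(functions)


def katz_grade(dependent: Iterable[str]) -> str:
    """Letter grade A-G, or 'Other', from the set of dependent ADLs."""
    functions = set(dependent)
    n = katz_dependency_count(functions)
    if n == 0:
        return "A"
    if n == 1:
        return "B"
    if n == 6:
        return "G"
    # Grades C-F require dependence in the first n - 1 functions of the
    # hierarchy, plus any 1 additional function.
    required = set(KATZ_FUNCTIONS[: n - 1])
    return "CDEF"[n - 2] if required <= functions else "Other"
```

For example, `katz_grade(["bathing", "feeding"])` returns "C", whereas dependence in dressing and feeding only falls outside the hierarchy and is graded "Other".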

Score interpretation. Scores reflect the specific ADLs or the number of ADLs in which the individual is dependent. Higher (alphabetically or numerically) scores reflect greater dependence.

Respondent burden. Five minutes to complete. Instrument is performance based.

Administrative burden. Must observe the patient in each ADL to determine level of independence.

Translations/adaptations. The Katz Index of ADL has been adapted into several versions that are comparable to the original (59,60), while others have been modified (61,62). The Katz Index of ADL has also been translated into Spanish (63).

Psychometric Information

Method of development. The Katz Index of ADL was developed from the observations of inpatients with hip fractures. Observations were made by physicians, nurses, and other health professionals (54).

Acceptability. The Katz Index of ADL measures only basic ADLs, and therefore has ceiling effects, i.e., the index cannot discriminate well among people with no and mild limitations.

Reliability. The interrater reliability is 0.95 or better after training (54,64). The coefficient of reproducibility (a measure of the internal consistency of an ordered measure) is 0.96–0.99 (65). In a study examining the reliability and validity of self-reported limitations in ADL among Turkish, Moroccan, and indigenous Dutch elderly in The Netherlands, Reijneveld et al reported that internal consistency reliabilities were good for all ethnic groups, being slightly higher for Turkish and Moroccan elderly people than for Dutch elderly (66).

Validity. Regarding construct validity, the Katz Index of ADL is associated with scores from the Barthel Index (r = 0.78 [67], κ = 0.77 [68]). The Spanish versions of the Katz Index of ADL are associated with mortality, institutionalization, and utilization of social health services (63). For predictive validity, the Katz Index of ADL is associated with mobility dysfunction (0.50) and house confinement (0.39) among older patients 2 years later (69). There is also a correlation between ADL dependency level and mortality among nursing home residents (64). Comparing patients at 1-month poststroke, those with grade A-B-C at admission were more likely to go home compared with those with a grade of D-E-F-G (56).

Ability to detect change. The scale had a significant floor effect, in that it is relatively insensitive to variations at low levels of disability (36). Scores on the Katz ADL scale are dependent on the physical environment, i.e., different scores may be obtained for individuals in different settings or with different environmental modifications (37).

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths. The Katz Index of ADL measures important functional limitations, which can occur in rheumatologic patient populations.

Caveats and cautions. There has been little investigation of sensitivity and responsiveness of the Katz Index of ADL. Most problematic is potential for ceiling effects with people with mild limitations in ADLs. This could lead to the index not being responsive to changes in ADLs in people with low levels of disability.

Clinical usability. The psychometric evaluation provides some support for the clinical use of the Katz Index of ADL; however, more robust measures of ADL function, such as the Functional Independence Measure, should be considered.

Research usability. Use of the Katz Index of ADL in research studies is not well supported.

FUNCTIONAL INDEPENDENCE MEASURE (FIM)

Description

Purpose. The FIM estimates the level of assistance needed for patients to complete basic activities of daily living (ADL) (70). The FIM was designed to be an assessment tool that could be implemented universally across all patient populations within an inpatient rehabilitation hospital environment (70).

Content. The FIM includes 18 basic ADLs, such as self-care, sphincter control, transfers, locomotion, communication, and social cognition. Clinicians score patients on a 7-point scale ranging from dependent to independent, which reflects the level of assistance needed to complete each ADL.

Number of items. The 18 FIM items are organized into the motor and cognitive domains, which are further organized into 4 subscales for the motor domain and 2 subscales for the cognitive domain.

Response options/scale. A trained health professional rates a patient on a scale of 1–7, where 1 = total assistance (the patient provides <25% effort to complete each task), 2 = maximal assistance (25–49% effort), 3 = moderate assistance (50–74% effort), 4 = minimal assistance (≥75% effort), 5 = supervision/set up (need for supervision but no physical contact), 6 = modified independence (use of a device or need for more than a reasonable time to complete each task), and 7 = complete independence (the patient completes each task in a timely and safe manner). Different health professionals can score sections specific to their discipline. For instance, a physical therapist can score the mobility-related items for a patient while an occupational therapist scores the ADL-related items. There are 2 gross score classifications: dependent (helper: scores 1–5) and independent (no helper: scores 6–7). The total FIM score is calculated by summing the score of each of the 18 items.
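As a rough illustration of the rating levels described above, the following Python sketch maps the share of effort supplied by the patient to levels 1–4 and classifies any rating as helper or no helper. The function names are hypothetical, the cut points assume the thresholds read as less than 25%, 25–49%, 50–74%, and at least 75%, and the effort mapping applies only when a helper gives physical assistance (levels 5–7 depend on supervision, devices, time, and safety rather than on percent effort).

```python
# Illustrative sketch only; names are hypothetical and not part of the FIM System.

def fim_level_from_effort(patient_effort_pct):
    """Map patient effort (0-100%) to FIM levels 1-4 for physically assisted tasks."""
    if not 0 <= patient_effort_pct <= 100:
        raise ValueError("effort must be between 0 and 100 percent")
    if patient_effort_pct < 25:
        return 1  # total assistance
    if patient_effort_pct < 50:
        return 2  # maximal assistance
    if patient_effort_pct < 75:
        return 3  # moderate assistance
    return 4      # minimal assistance


def fim_gross_class(rating):
    """Gross classification: ratings 1-5 need a helper, ratings 6-7 do not."""
    if rating not in range(1, 8):
        raise ValueError("FIM item ratings are integers from 1 to 7")
    return "dependent (helper)" if rating <= 5 else "independent (no helper)"
```

Under these assumptions, a patient who supplies 60% of the effort for a transfer would be rated 3 (moderate assistance) and classified as dependent.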

Recall period for items. Immediate.

Endorsements. The FIM is used to determine payment for inpatient acute rehabilitation services from the Centers for Medicare and Medicaid Services. In particular, the FIM is used to determine coverage for patients under Medicare Part A.

Examples of use. Individuals within the inpatient acute rehabilitation hospital setting.

Practical Application

How to obtain. The FIM System program is available at http://www.udsmr.org/. A sample of the FIM instrument can be found at http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=physmedrehab&part=A11332&rendertype=figure&id=A11340.

Method of administration. Observation by members of an interdisciplinary team.

Scoring. Specific scoring instructions apply to the FIM. Training manuals for scoring are available from the Centers for Medicare and Medicaid Services, http://www.cms.gov/InpatientRehabFacPPS/04_IRFPAI.asp.

Score interpretation. Scores range from 18–126. Higher scores represent more independence. A score of 18 represents complete dependence, while a score of 126 represents complete independence. The total FIM score is appropriate to report if the goal of assessment is to determine the overall burden of care (71). There are 2 domains of the FIM: motor and cognitive. The motor domain subscales include self-care (6 items: eating, grooming, bathing, dressing upper body, dressing lower body, and toileting); sphincter control (2 items: bladder management, bowel management); transfers (3 items: bed/chair/wheelchair, toilet, tub/shower); and locomotion (2 items: walk or wheelchair, stairs). The motor domain was developed from the Barthel Index (72). The cognitive domain subscales include communication (2 items: comprehension, expression) and social cognition (3 items: social interaction, problem solving, memory). The mean ± SD admission FIM total was 73.2 ± 12.9 and discharge FIM total was 101.7 ± 12.9 for patients with lower extremity joint replacement who were discharged from a rehabilitation program in 2007 (73).
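The domain and subscale structure described above lends itself to a simple summation. The Python sketch below is illustrative only: the item keys and function name are hypothetical, and it assumes that every one of the 18 items has already been rated 1–7 by trained raters. It does not replace the official scoring instructions.

```python
# Illustrative sketch of FIM score aggregation; item keys are hypothetical shorthand.
FIM_SUBSCALES = {
    "self_care": ["eating", "grooming", "bathing", "dressing_upper",
                  "dressing_lower", "toileting"],
    "sphincter_control": ["bladder", "bowel"],
    "transfers": ["bed_chair_wheelchair", "toilet", "tub_shower"],
    "locomotion": ["walk_wheelchair", "stairs"],
    "communication": ["comprehension", "expression"],
    "social_cognition": ["social_interaction", "problem_solving", "memory"],
}
MOTOR = ("self_care", "sphincter_control", "transfers", "locomotion")
COGNITIVE = ("communication", "social_cognition")


def score_fim(ratings):
    """Sum 18 item ratings (each 1-7) into subscale, domain, and total scores."""
    expected = {item for items in FIM_SUBSCALES.values() for item in items}
    if set(ratings) != expected:
        raise ValueError("ratings must cover exactly the 18 FIM items")
    for item, rating in ratings.items():
        if rating not in range(1, 8):
            raise ValueError(f"{item}: rating must be an integer from 1 to 7")
    subscales = {name: sum(ratings[i] for i in items)
                 for name, items in FIM_SUBSCALES.items()}
    motor = sum(subscales[s] for s in MOTOR)          # 13 items, range 13-91
    cognitive = sum(subscales[s] for s in COGNITIVE)  # 5 items, range 5-35
    return {"subscales": subscales, "motor": motor,
            "cognitive": cognitive, "total": motor + cognitive}
```

With every item rated 1 the total is 18, and with every item rated 7 the total is 126, matching the score range given above.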

Respondent burden. 30–45 minutes to perform all activities. Patients are asked to perform each functional task in order to generate a score, which may be difficult.

Administrative burden. 7 minutes to collect demographic data and 10 minutes to score. Formal training is needed to administer the FIM. A training examination is available at: http://www.udsmr.org/.

Translations/adaptations. The FIM has been translated into different languages including Italian and Turkish (74,75).

Psychometric Information

Method of development. The FIM was created to provide an improvement over the Barthel Index. It has been developed and tested mainly in people with neurologic pathology.

Acceptability. Missing data are not common. The instrument has some ceiling effects within each of the motor and cognitive domains.

Reliability. High reliability has been reported for the FIM. In a quantitative review of 11 studies, Ottenbacher et al reported high interrater and test–retest reliability for health professionals with a variety of educational backgrounds and levels of training (76). Based on 1,568 patients with a variety of medical diagnoses, the median interrater reliability was 0.95 and test–retest reliability was 0.95. Median reliability for the 6 subscales ranged from 0.78 (social cognition) to 0.95 (self-care), and the 18 individual items ranged from 0.61 (comprehension) to 0.90 (toilet transfer). Pollak and colleagues also found high test–retest reliability for the motor (intraclass correlation coefficient [ICC] 0.90) and cognitive domains (ICC 0.80) in a cohort of older adults age ≥80 years residing in a multilevel retirement community (77).

High internal consistency was found for the total FIM score (Cronbach’s α = 0.88–0.97) (71,78), the motor domain (α = 0.84–0.97) (71,79), and the cognitive domain (α = 0.86–0.95) (71) within a large sample of inpatients undergoing acute rehabilitation with various diagnoses. However, lower internal consistency was reported for the locomotion subscale (α = 0.68), suggesting that the individual items (ambulation/wheelchair use and stair climbing) may be measuring a different latent construct of function (78). Internal consistency was also high for FIM scores obtained via interview (α = 0.94) or observation (α = 0.90) (80).

Validity. Regarding concurrent validity, FIM scores assigned by a single nonclinician interview and by observation by a team of health care professionals were similar (ICC 0.74 for admission FIM and ICC 0.76 for discharge FIM), which provides evidence that a multi-interviewer–administered FIM is a valid method for collecting data (81). For construct validity, the separation of the FIM into motor and cognitive domains has been found to be a valid method of measuring activity limitation (82–84). The items in each domain show a generally consistent pattern of difficulty rating across multiple medical diagnoses, with eating the least difficult motor item to achieve an independent rating, and stair climbing the most difficult (82–84). For cognitive items, expression is the least difficult and problem solving is the most difficult (82,84). FIM scores are correlated with age, comorbidity, and discharge destination (78), as well as other functional measures, such as the Barthel Index and the Functional Assessment Measure (79,80,85,86).

While little work has examined the predictive validity of the FIM within rheumatologic patient populations, several studies have examined this within stroke. Trends from these studies can be carefully considered for patients with rheumatologic conditions who are at an inpatient rehabilitation hospital. Admission FIM scores have been shown to predict length of stay and discharge FIM scores in a rehabilitation hospital following stroke (80,87–91). In particular, an increase in the admission score of the motor domain by 1 point is correlated with a 1.1-day decrease in average rehabilitation length of stay for patients with stroke (87). There is a strong association between total FIM scores and discharge destination, i.e., discharge home versus skilled nursing facility (90,92–95). A majority of patients with stroke with admission FIM scores ≥80 are discharged home, while less than half with admission FIM scores <40 are discharged home, regardless of age (94). Social support has been shown to be a decisive factor for discharge destination, especially for those requiring high levels of assistance (90,93).

Ability to detect change. The FIM, especially the motor FIM, is highly responsive in detecting changes in ADL performance (78,80,96), but the cognitive FIM has a poor responsiveness due to its significant ceiling effect seen across a wide variety of medical diagnoses (96–99). There is comparable responsiveness between the FIM and the Barthel Index (79,80,85,96,100). Beninato et al reported a minimum clinically important difference for the total FIM of 22, the motor FIM of 17, and the cognitive FIM of 3 in the stroke population when anchored to a physician’s assessment of minimally clinically important change (101).
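To show how the thresholds reported by Beninato et al might be applied, the Python sketch below compares an admission-to-discharge change with the corresponding minimum clinically important difference. The names are hypothetical, the comparison assumes that a change equal to the threshold counts as important, and the thresholds come from a stroke population, so applying them to rheumatologic patients would be an untested assumption.

```python
# Illustrative sketch only. Thresholds are the stroke-population values cited above (101);
# their use outside stroke is an assumption, not an established result.
MCID_STROKE = {"total": 22, "motor": 17, "cognitive": 3}


def fim_change_vs_mcid(admission, discharge, scale="total"):
    """Return (change, True/False) where the flag marks change >= the MCID for that scale."""
    if scale not in MCID_STROKE:
        raise ValueError(f"scale must be one of {sorted(MCID_STROKE)}")
    change = discharge - admission
    return change, change >= MCID_STROKE[scale]
```

For instance, a total FIM score that rises from 73 at admission to 101 at discharge is a 28-point gain, which exceeds the 22-point threshold for the total FIM.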


Critical Appraisal of Overall Value to the Rheumatology Community

Strengths. The FIM is a widely used tool in the rehabilitation setting across a broad range of medical diagnoses including rheumatologic diagnoses. The FIM is appropriate for evaluating interventions for people with severe functional limitation.

Caveats and cautions. The FIM is not intended for community-dwelling adults who are independent in most functional activities. Future work is needed to establish the predictive validity of the FIM within rheumatologic patient populations.

Clinical usability. The psychometric evaluation does support interpretation of scores for individuals with severe functional limitation. Clinical use is primarily in an inpatient acute rehabilitation setting.

Research usability. The psychometric evaluation does support use of the FIM within intervention studies and observational studies.

OSTEOARTHRITIS-FUNCTION-COMPUTER ADAPTIVE TEST (OA-FUNCTION-CAT)

Description

Purpose. The OA-FUNCTION-CAT employs computer adaptive testing to estimate a respondent’s level of functioning. It was developed as a disease-specific measure for people with hip or knee OA (102).

Content. The OA-FUNCTION-CAT utilizes an item bank of 125 functional activities specific to hip or knee OA.

Number of items in scale. The OA-FUNCTION-CAT selects 5, 10, or 15 items from the 125-item bank for administration.

Recall period for items. Over the past month on an average day.

Endorsements. None.

Examples of use. The OA-FUNCTION-CAT was developed in a hip and knee OA cohort of subjects (102).

Practical Application

How to obtain. Contact CREcare (http://www.crecare.com/home.html) regarding cost and availability of the instrument. The 125-item bank is available for no fee at http://www.biomedcentral.com/content/supplementary/ar2760-S1.doc.

Method of administration. A CAT tailors assessment to each individual by selecting and administering subsequent questions based on the individual’s response to the previous question. The program begins by selecting a question from the middle of the continuum of the calibrated item bank. Based on how the respondent answers the question, the computer calculates an initial score and level of precision. The CAT will conclude the test based on predetermined stop rules based on level of precision and/or a maximum number of items that are to be used to estimate the score. After the first question is answered, the program decides if the stop rule has been met. If not, another question is selected from the item bank based on the answer given for the previous question. This process is repeated until the stop rule has been satisfied, and a final score is calculated. This approach allows for the selection of items that provide the most relevant information at the level of the individual’s current score estimate, therefore eliminating irrelevant questions from being asked (102–104).
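The administration loop described above can be sketched in Python under simplifying assumptions. This is not the OA-FUNCTION-CAT algorithm or its calibrated item bank: the sketch assumes a dichotomous Rasch model (the actual items have 3 response options), a standard normal prior, a grid-based posterior mean as the score estimate, and hypothetical names and default stop-rule values throughout.

```python
import math

# Minimal CAT sketch under stated assumptions (dichotomous Rasch items, normal prior).
# All names, the grid, and the default stop-rule values are hypothetical.


def p_able(theta, difficulty):
    """Rasch probability that a person at level theta endorses an item."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def estimate(responses, grid):
    """Posterior mean and SD of theta over a grid, given (difficulty, answer) pairs."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # standard normal prior (unnormalized)
        for difficulty, endorsed in responses:
            p = p_able(theta, difficulty)
            w *= p if endorsed else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank, answer, max_items=10, target_sd=0.3):
    """bank: {item_id: difficulty}; answer: callable item_id -> True/False."""
    grid = [g / 10.0 for g in range(-40, 41)]
    remaining = dict(bank)
    # Start with the item closest to the middle of the calibrated continuum.
    middle = sorted(bank.values())[len(bank) // 2]
    item = min(remaining, key=lambda i: abs(remaining[i] - middle))
    responses, asked = [], []
    while True:
        difficulty = remaining.pop(item)
        responses.append((difficulty, answer(item)))
        asked.append(item)
        theta, sd = estimate(responses, grid)
        # Stop rule: precision reached, item limit reached, or bank exhausted.
        if sd <= target_sd or len(asked) >= max_items or not remaining:
            return theta, sd, asked

        def information(i):
            # Rasch item information p(1 - p), largest near the current estimate.
            p = p_able(theta, remaining[i])
            return p * (1.0 - p)

        item = max(remaining, key=information)
```

A call such as `run_cat(bank, answer, max_items=5)` would return the score estimate on the model's latent scale, its posterior standard deviation, and the list of items administered; rescaling to a 0–100 metric would be a separate step that this sketch does not attempt.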

Scoring. Continuous scale. For the functional difficulty scale, items are reported in terms of amount of difficulty in performing each function (none, a little, or a lot). For the functional pain scale, items are reported in terms of pain severity in performing each function (none, mild or moderate, or severe). The computer automatically calculates an outcome score representing how much limitation the individual has within the spectrum of functional limitation. This score is based on the individual’s response to each of the questions asked.

Interpretation of scores. Scores range from 0–100. Higher scores represent higher function and less pain. The score produced on the CAT can be compared to other OA-FUNCTION-CAT scores regardless of the specific questions that were asked to generate the score. The OA-FUNCTION-CAT calculates a functional outcome score that can be compared within and between respondents.

Respondent burden. 15 or fewer questions are asked (questions written on a sixth-grade level of comprehension).

Administrative burden. Minimal burden since the computer program calculates the score in real time, so the score is available immediately.

Translations/adaptations. None.

Training to interpret. Not reported.

Psychometric Information

Reliability. There is a high level of accuracy between the 5-, 10-, and 15-item OA-FUNCTION-CATs and the full item bank (Pearson’s r = 0.92, 0.96, and 0.97, respectively, for the functional difficulty subscale and 0.89, 0.95, and 0.97, respectively, for the functional pain subscale) among people with hip or knee OA. There is high conditional reliability, i.e., examinee level reliability (105), for both the functional difficulty and functional pain subscales (95% of the sample scores achieved reliability estimates ≥0.97 and ≥0.96, respectively) (102).

Validity. Regarding construct validity, both the functional difficulty and functional pain subscales fit a unidimensional model. Both of the OA-FUNCTION-CAT subscales cover a broader estimated scoring range than the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), especially at the upper, i.e., higher functioning, end of the scale. The OA-FUNCTION-CAT had less of a ceiling effect than the WOMAC (0.6% of subjects were at the ceiling for the OA-FUNCTION-CAT functional pain subscale versus 6.4% for the WOMAC pain scale, and 0.6% of subjects were at the ceiling for the OA-FUNCTION-CAT functional difficulty subscale versus 3.0% for the WOMAC physical function scale). The OA-FUNCTION-CAT did not have a floor effect (102).

Ability to detect change. The 10-item OA-FUNCTION-CAT has a higher degree of precision than the WOMAC across the full range of scores for both subscales, especially at the upper end of the scale in the functional pain subscale within people with hip or knee OA (102).

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths. The OA-FUNCTION-CAT is an innovative method of measuring patient-reported outcomes relevant to people with rheumatologic-related disorders. The OA-FUNCTION-CAT has improved psychometric properties and requires fewer questions compared with legacy measures. Specifically, CATs offer a highly reliable and precise method to quantify patient-reported limitations along a broad continuum. In addition, CAT scores can be estimated after only a few questions are answered, which decreases overall time and cost of administration.

Caveats and cautions. Future work is needed to examine the test–retest reliability of the OA-FUNCTION-CAT; utilization of CAT methods for estimating patient-reported outcomes is likely to increase among clinicians and researchers.

Clinical usability. The psychometric evaluation of the OA-FUNCTION-CAT supports interpretation of scores to make decisions about individuals. Given the minimal burden on patients and clinicians, the OA-FUNCTION-CAT is very appropriate to use clinically.

Research usability. The OA-FUNCTION-CAT can be used within intervention trials and observational studies given the psychometrics of this instrument. Values representing meaningful change have yet to be established, which may limit clinical and research application.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published.

REFERENCES

1. Ware JE Jr, Sherbourne CD. The MOS 36-item Short-Form health survey (SF-36). I. Conceptual framework and item selection. Med Care 1992;30:473–83.

2. Dubuc N, Haley S, Ni P, Kooyoomjian J, Jette A. Function and disability in late life: comparison of the Late-Life Function and Disability Instrument to the Short-Form-36 and the London Handicap Scale. Disabil Rehabil 2004;26:362–70.

3. Ware JE Jr, Kosinski M, Bayliss MS, McHorney CA, Rogers WH, Raczek A. Comparison of methods for the scoring and statistical analysis of SF-36 health profile and summary measures: summary of results from the Medical Outcomes Study. Med Care 1995;33 Suppl:AS264–79.

4. Jenkinson C, Coulter A, Wright L. Short form 36 (SF36) health survey questionnaire: normative data for adults of working age. BMJ 1993;306:1437–40.

5. Bowling A, Bond M, Jenkinson C, Lamping DL. Short Form 36 (SF-36) Health Survey questionnaire: which normative data should be used? Comparisons between the norms provided by the Omnibus Survey in Britain, the Health Survey for England and the Oxford Healthy Life Survey. J Public Health Med 1999;21:255–70.

6. Ware JE Jr, Gandek B, Kosinski M, Aaronson NK, Apolone G, Brazier J, et al. The equivalence of SF-36 summary health scores estimated using standard and country-specific algorithms in 10 countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998;51:1167–70.

7. Bruce B, Fries JF, Ambrosini D, Lingala B, Gandek B, Rose M, et al. Better assessment of physical function: item improvement is neglected but essential. Arthritis Res Ther 2009;11:R191.

8. Stewart AL, Hays RD, Ware JE Jr. The MOS Short-Form General Health Survey: reliability and validity in a patient population. Med Care 1988;26:724–35.

9. Ruta DA, Hurst NP, Kind P, Hunter M, Stubbings A. Measuring health status in British patients with rheumatoid arthritis: reliability, validity and responsiveness of the short form 36-item health survey (SF-36). Br J Rheumatol 1998;37:425–36.

10. Davidson M, Keating JL. A comparison of five low back disability questionnaires: reliability and responsiveness. Phys Ther 2002;82:8–24.

11. Bohannon RW, DePasquale L. Physical Functioning Scale of the Short-Form (SF) 36: internal consistency and validity with older adults. J Geriatr Phys Ther 2010;33:16–8.

12. Ten Klooster PM, Oude Voshaar MA, Taal E, van de Laar MA. Comparison of measures of functional disability in patients with gout. Rheumatology (Oxford) 2011;50:709–13.

13. Salaffi F, Carotti M, Grassi W. Health-related quality of life in patients with hip or knee osteoarthritis: comparison of generic and disease-specific instruments. Clin Rheumatol 2005;24:29–37.

14. Gandhi R, Tsvetkov D, Davey JR, Syed KA, Mahomed NN. Relationship between self-reported and performance-based tests in a hip and knee joint replacement population. Clin Rheumatol 2009;28:253–7.

15. Kvien TK, Kaasa S, Smedstad LM. Performance of the Norwegian SF-36 Health Survey in patients with rheumatoid arthritis II: a comparison of the SF-36 with disease-specific measures. J Clin Epidemiol 1998;51:1077–86.

16. Haley SM, McHorney CA, Ware JE Jr. Evaluation of the MOS SF-36 physical functioning scale (PF-10) I: unidimensionality and reproducibility of the Rasch item scale. J Clin Epidemiol 1994;47:671–84.

17. Taylor WJ, McPherson KM. Using Rasch analysis to compare the psychometric properties of the Short Form 36 physical function score and the Health Assessment Questionnaire disability index in patients with psoriatic arthritis and rheumatoid arthritis. Arthritis Rheum 2007;57:723–9.

18. Raczek AE, Ware JE, Bjorner JB, Gandek B, Haley SM, Aaronson NK, et al. Comparison of Rasch and summated rating scales constructed from SF-36 physical functioning items in seven countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol 1998;51:1203–14.

19. Martin M, Kosinski M, Bjorner JB, Ware JE Jr, Maclean R, Li T. Item response theory methods can improve the measurement of physical function by combining the modified health assessment questionnaire and the SF-36 physical function scale. Qual Life Res 2007;16:647–60.

20. Spratt KF. Patient-level minimal clinically important difference based on clinical judgment and minimally detectable measurement difference: a rationale for the SF-36 physical function scale in the SPORT intervertebral disc herniation cohort. Spine (Phila Pa 1976) 2009;34:1722–31.

21. Patrick DL, Deyo RA, Atlas SJ, Singer DE, Chapin A, Keller RB. Assessing health-related quality of life in patients with sciatica. Spine (Phila Pa 1976) 1995;20:1899–908.

22. McHorney CA, Haley SM, Ware JE Jr. Evaluation of the MOS SF-36 Physical Functioning Scale (PF-10) II: comparison of relative precision using Likert and Rasch scoring methods. J Clin Epidemiol 1997;50:451–61.

23. Fries JF, Spitz PW, Young DY. The dimensions of health outcomes: the health assessment questionnaire, disability and pain scales. 1982;9:789–93.

24. Fries JF, Spitz P, Kraines RG, Holman HR. Measurement of patient outcome in arthritis. Arthritis Rheum 1980;23:137–45.

25. Schefte DB, Hetland ML. An open-source, self-explanatory touch screen in routine care: validity of filling in the Bath measures on Ankylosing Spondylitis Disease Activity Index, Function Index, the Health Assessment Questionnaire and Visual Analogue Scales in comparison with paper versions. Rheumatology (Oxford) 2010;49:99–104.

26. Van der Heide A, Jacobs JW, van Albada-Kuipers GA, Kraaimaat FW, Geenen R, Bijlsma JW. Self report functional disability scores and the use of devices: two distinct aspects of physical function in rheumatoid arthritis. Ann Rheum Dis 1993;52:497–502.

27. Tomlin GS, Holm MB, Rogers JC, Kwoh CK. Comparison of standard and alternative health assessment questionnaire scoring procedures for documenting functional outcomes in patients with rheumatoid arthritis. J Rheumatol 1996;23:1524–30.

28. Wolfe F. Which HAQ is best? A comparison of the HAQ, MHAQ and RA-HAQ, a difficult 8 item HAQ (DHAQ), and a rescored 20 item HAQ (HAQ20): analyses in 2,491 rheumatoid arthritis patients following leflunomide initiation. J Rheumatol 2001;28:982–9.

29. Krishnan E, Sokka T, Hakkinen A, Hubert H, Hannonen P. Normative values for the Health Assessment Questionnaire Disability Index: benchmarking disability in the general population. Arthritis Rheum 2004;50:953–60.

30. Bruce B, Fries JF. The Stanford Health Assessment Questionnaire: a review of its history, issues, progress, and documentation. J Rheumatol 2003;30:167–78.

31. Wolfe F, Michaud K, Pincus T. Development and validation of the Health Assessment Questionnaire II: a revised version of the health assessment questionnaire. Arthritis Rheum 2004;50:3296–305.

32. Fries JF, Cella D, Rose M, Krishnan E, Bruce B. Progress in assessing physical function in arthritis: PROMIS short forms and computerized adaptive testing. J Rheumatol 2009;36:2061–6.

33. Alvarez-Hernandez E, Pelaez-Ballestas I, Vazquez-Mellado J, Teran-Estrada L, Bernard-Medina AG, Espinoza J, et al. Validation of the Health Assessment Questionnaire Disability Index in patients with gout. Arthritis Rheum 2008;59:665–9.

34. Daltroy LH, Larson MG, Eaton HM, Phillips CB, Liang MH. Discrepancies between self-reported and observed physical function in the elderly: the influence of response shift and other factors. Soc Sci Med 1999;48:1549–61.

35. Van Groen MM, ten Klooster PM, Taal E, van de Laar MA, Glas CA. Application of the Health Assessment Questionnaire disability index to various rheumatic diseases. Qual Life Res 2010;19:1255–63.

36. McDowell I, Newell C. Measuring health: a guide to rating scales and questionnaires. 2nd ed. New York: Oxford University Press; 1996.

37. Spilker B. Quality of life and pharmacoeconomics in clinical trials. 2nd ed. Philadelphia: Lippincott-Raven; 1996.

38. Bruce B, Fries J. Longitudinal comparison of the Health Assessment Questionnaire (HAQ) and the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC). Arthritis Rheum 2004;51:730–7.

39. Liang MH, Larson MG, Cullen KE, Schwartz JA. Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum 1985;28:542–7.

40. Uhlig T, Haavardsholm EA, Kvien TK. Comparison of the Health Assessment Questionnaire (HAQ) and the modified HAQ (MHAQ) in patients with rheumatoid arthritis. Rheumatology (Oxford) 2006;45:454–8.

41. Redelmeier DA, Lorig K. Assessing the clinical importance of symptomatic improvements: an illustration in rheumatology. Arch Intern Med 1993;153:1337–42.

42. Kosinski M, Zhao SZ, Dedhiya S, Osterhaus JT, Ware JE Jr. Determining minimally important changes in generic and disease-specific health-related quality of life questionnaires in clinical trials of rheumatoid arthritis. Arthritis Rheum 2000;43:1478–87.

43. Pope JE, Khanna D, Norrie D, Ouimet JM. The minimally important difference for the health assessment questionnaire in rheumatoid arthritis clinical practice is smaller than in randomized controlled trials. J Rheumatol 2009;36:254–9.

44. Kwok T, Pope JE. Minimally important difference for patient-reported outcomes in psoriatic arthritis: Health Assessment Questionnaire and pain, fatigue, and global visual analog scales. J Rheumatol 2010;37:1024–8.

45. Colangelo KJ, Pope JE, Peschken C. The minimally important difference for patient reported outcomes in systemic lupus erythematosus including the HAQ-DI, pain, fatigue, and SF-36. J Rheumatol 2009;36:2231–7.

46. Wheaton L, Pope J. The minimally important difference for patient-reported outcomes in spondyloarthropathies including pain, fatigue, sleep, and Health Assessment Questionnaire. J Rheumatol 2010;37:816–22.

47. Sekhon S, Pope J, Baron M. The minimally important difference in clinical practice for patient-centered outcomes including health assessment questionnaire, fatigue, pain, sleep, global visual analog scale, and SF-36 in scleroderma. J Rheumatol 2010;37:591–8.

48. Pincus T, Summey JA, Soraci SA Jr, Wallston KA, Hummon NP. Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis Rheum 1983;26:1346–53.

49. Blalock SJ, Sauter SV, Devellis RF. The modified Health Assessment Questionnaire difficulty scale: a health status measure revisited. Arthritis Care Res 1990;3:182–8.

50. Arvidson NG, Larsson A, Larsen A. Simple function tests, but not the modified HAQ, correlate with radiological joint damage in rheumatoid arthritis. Scand J Rheumatol 2002;31:146–50.

51. Stucki G, Stucki S, Bruhlmann P, Michel BA. Ceiling effects of the Health Assessment Questionnaire and its modified version in some ambulatory rheumatoid arthritis patients. Ann Rheum Dis 1995;54:461–5.

52. Ziebland S, Fitzpatrick R, Jenkinson C, Mowat A. Comparison of two approaches to measuring change in health status in rheumatoid arthritis: the Health Assessment Questionnaire (HAQ) and modified HAQ. Ann Rheum Dis 1992;51:1202–5.

53. Katz P, for the Association of Rheumatology Health Professionals Outcomes Measures Task Force. Measures of adult general functional status. Arthritis Rheum 2003;49:S15–27.

54. Katz S, Ford AB, Moskowitz RW, Jackson BA, Jaffe MW. Studies ofillness in the aged. The Index of ADL: a standardized measure ofbiological and psychosocial function. JAMA 1963;185:914–9.

55. Asberg KH, Sonn U. The cumulative structure of personal and instru-mental ADL: a study of elderly people in a health service district.Scand J Rehabil Med 1989;21:171–7.

56. Asberg KH, Nydevik I. Early prognosis of stroke outcome by means ofKatz Index of activities of daily living. Scand J Rehabil Med 1991;23:187–91.

57. Beloosesky Y, Grinblat J, Epelboym B, Weiss A, Grosman B, Hendel D.Functional gain of hip fracture patients in different cognitive andfunctional groups. Clin Rehabil 2002;16:321–8.

58. Katz S, Akpom CA. A measure of primary sociobiological functions.Int J Health Serv 1976;6:493–508.

59. Spector WD, Katz S, Murphy JB, Fulton JP. The hierarchical relation-ship between activities of daily living and instrumental activities ofdaily living. J Chronic Dis 1987;40:481–9.

60. Rodgers W, Miller B. A comparative analysis of ADL questions insurveys of older people. J Gerontol B Psychol Sci Soc Sci 1997;52 SpecNo:21–36.

61. Reuben DB, Valle LA, Hays RD, Siu AL. Measuring physical functionin community-dwelling older persons: a comparison of self-administered, interviewer-administered, and performance-based mea-sures. J Am Geriatr Soc 1995;43:17–23.

62. LaPlante MP. The classic measure of disability in activities of dailyliving is biased by age but an expanded IADL/ADL measure is not. JGerontol B Psychol Sci Soc Sci 2010;65:720–32.

63. Cabanero-Martinez MJ, Cabrero-Garcia J, Richart-Martinez M, Munoz-Mendoza CL. The Spanish versions of the Barthel index (BI) and theKatz index (KI) of activities of daily living (ADL): a structured review.Arch Gerontol Geriatr 2009;49:e77–84.

64. Spector WD, Takada HA. Characteristics of nursing homes that affect resident outcomes. J Aging Health 1991;3:427–54.

65. Brorsson B, Asberg KH. Katz index of independence in ADL: reliability and validity in short-term care. Scand J Rehabil Med 1984;16:125–32.

66. Reijneveld SA, Spijker J, Dijkshoorn H. Katz’ ADL index assessed functional performance of Turkish, Moroccan, and Dutch elderly. J Clin Epidemiol 2007;60:382–8.

67. Rockwood K, Stolee P, Fox RA. Use of goal attainment scaling in measuring clinically important change in the frail elderly. J Clin Epidemiol 1993;46:1113–8.

68. Gresham GE, Phillips TF, Labi ML. ADL status in stroke: relative merits of three standard indexes. Arch Phys Med Rehabil 1980;61:355–8.

69. Katz S, Downs TD, Cash HR, Grotz RC. Progress in development of the index of ADL. Gerontologist 1970;10:20–30.

70. Keith RA, Granger CV, Hamilton BB, Sherwin FS. The functional independence measure: a new tool for rehabilitation. Adv Clin Rehabil 1987;1:6–18.

71. Stineman MG, Shea JA, Jette A, Tassoni CJ, Ottenbacher KJ, Fiedler R, et al. The Functional Independence Measure: tests of scaling assumptions, structure, and reliability across 20 diverse impairment categories. Arch Phys Med Rehabil 1996;77:1101–8.

72. Mahoney FI, Barthel DW. Functional evaluation: the Barthel Index. Md State Med J 1965;14:61–5.

73. Granger CV, Markello SJ, Graham JE, Deutsch A, Reistetter TA, Ottenbacher KJ. The uniform data system for medical rehabilitation: report of patients with lower limb joint replacement discharged from rehabilitation programs in 2000-2007. Am J Phys Med Rehabil 2010;89:781–94.

74. Invernizzi M, Carda S, Milani P, Mattana F, Fletzer D, Iolascon G, et al. Development and validation of the Italian version of the Spinal Cord Independence Measure III. Disabil Rehabil 2010;32:1194–203.

75. Kucukdeveci AA, Yavuzer G, Elhan AH, Sonel B, Tennant A. Adaptation of the Functional Independence Measure for use in Turkey. Clin Rehabil 2001;15:311–9.

76. Ottenbacher KJ, Hsu Y, Granger CV, Fiedler RC. The reliability of the functional independence measure: a quantitative review. Arch Phys Med Rehabil 1996;77:1226–32.

77. Pollak N, Rheault W, Stoecker JL. Reliability and validity of the FIM for persons aged 80 years and above from a multilevel continuing care retirement community. Arch Phys Med Rehabil 1996;77:1056–61.

78. Dodds TA, Martin DP, Stolov WC, Deyo RA. A validation of the functional independence measurement and its performance among rehabilitation inpatients. Arch Phys Med Rehabil 1993;74:531–6.

79. Hsueh IP, Lin JH, Jeng JS, Hsieh CL. Comparison of the psychometric characteristics of the functional independence measure, 5 item Barthel index, and 10 item Barthel index in patients with stroke. J Neurol Neurosurg Psychiatry 2002;73:188–90.

80. Sadaria KS, Bohannon RW, Lee N, Maljanian R. Ratings of physical function obtained by interview are legitimate for patients hospitalized after stroke. J Stroke Cerebrovasc Dis 2001;10:79–84.

81. Young Y, Fan MY, Hebel JR, Boult C. Concurrent validity of administering the functional independence measure (FIM) instrument by interview. Am J Phys Med Rehabil 2009;88:766–70.

82. Heinemann AW, Linacre JM, Wright BD, Hamilton BB, Granger C. Relationships between impairment and physical disability as measured by the functional independence measure. Arch Phys Med Rehabil 1993;74:566–73.

83. Linacre JM, Heinemann AW, Wright BD, Granger CV, Hamilton BB. The structure and stability of the Functional Independence Measure. Arch Phys Med Rehabil 1994;75:127–32.

84. Granger CV, Hamilton BB, Linacre JM, Heinemann AW, Wright BD. Performance profiles of the functional independence measure. Am J Phys Med Rehabil 1993;72:84–9.

85. Hobart JC, Lamping DL, Freeman JA, Langdon DW, McLellan DL, Greenwood RJ, et al. Evidence-based measurement: which disability scale for neurologic rehabilitation? Neurology 2001;57:639–44.

86. Gosman-Hedstrom G, Svensson E. Parallel reliability of the functional independence measure and the Barthel ADL index. Disabil Rehabil 2000;22:702–15.

87. Tan WS, Heng BH, Chua KS, Chan KF. Factors predicting inpatient rehabilitation length of stay of acute stroke patients in Singapore. Arch Phys Med Rehabil 2009;90:1202–7.

88. Inouye M, Kishi K, Ikeda Y, Takada M, Katoh J, Iwahashi M, et al. Prediction of functional outcome after stroke rehabilitation. Am J Phys Med Rehabil 2000;79:513–8.

89. Heinemann AW, Linacre JM, Wright BD, Hamilton BB, Granger C. Prediction of rehabilitation outcomes with disability measures. Arch Phys Med Rehabil 1994;75:133–43.

90. Koyama T, Sako Y, Konta M, Domen K. Poststroke discharge destination: functional independence and sociodemographic factors in urban Japan. J Stroke Cerebrovasc Dis 2011;20:202–7.

91. Ng YS, Jung H, Tay SS, Bok CW, Chiong Y, Lim PA. Results from a prospective acute inpatient rehabilitation database: clinical characteristics and functional outcomes using the Functional Independence Measure. Ann Acad Med Singapore 2007;36:3–10.

92. Black TM, Soltis T, Bartlett C. Using the Functional Independence Measure instrument to predict stroke rehabilitation outcomes. Rehabil Nurs 1999;24:109–14, 121.

93. Lutz BJ. Determinants of discharge destination for stroke patients. Rehabil Nurs 2004;29:154–63.

94. Alexander MP. Stroke rehabilitation outcome: a potential use of predictive variables to establish levels of care. Stroke 1994;25:128–34.

95. Gulati A, Yeo CJ, Cooney AD, McLean AN, Fraser MH, Allan DB. Functional outcome and discharge destination in elderly patients with spinal cord injuries. Spinal Cord 2011;49:215–8.

96. Van der Putten JJ, Hobart JC, Freeman JA, Thompson AJ. Measuring change in disability after inpatient rehabilitation: comparison of the responsiveness of the Barthel index and the Functional Independence Measure. J Neurol Neurosurg Psychiatry 1999;66:480–4.

97. Davidoff GN, Roth EJ, Haughton JS, Ardner MS. Cognitive dysfunction in spinal cord injury patients: sensitivity of the Functional Independence Measure subscales vs neuropsychologic assessment. Arch Phys Med Rehabil 1990;71:326–9.

98. Hall KM, Cohen ME, Wright J, Call M, Werner P. Characteristics of the Functional Independence Measure in traumatic spinal cord injury. Arch Phys Med Rehabil 1999;80:1471–6.

99. Kohler F, Dickson H, Redmond H, Estell J, Connolly C. Agreement of functional independence measure item scores in patients transferred from one rehabilitation setting to another. Eur J Phys Rehabil Med 2009;45:479–85.

100. Wallace D, Duncan PW, Lai SM. Comparison of the responsiveness of the Barthel Index and the motor component of the Functional Independence Measure in stroke: the impact of using different methods for measuring responsiveness. J Clin Epidemiol 2002;55:922–8.

101. Beninato M, Gill-Body KM, Salles S, Stark PC, Black-Schaffer RM, Stein J. Determination of the minimal clinically important difference in the FIM instrument in patients with stroke. Arch Phys Med Rehabil 2006;87:32–9.

102. Jette AM, McDonough CM, Ni P, Haley SM, Hambleton RK, Olarsch S, et al. A functional difficulty and functional pain instrument for hip and knee osteoarthritis. Arthritis Res Ther 2009;11:R107.

103. Jette AM, Haley SM. Contemporary measurement techniques for rehabilitation outcomes assessment. J Rehabil Med 2005;37:339–45.

104. Cella D, Gershon R, Lai JS, Choi S. The future of outcomes measurement: item banking, tailored short-forms, and computerized adaptive assessment. Qual Life Res 2007;16 Suppl 1:133–41.

105. Raju NS, Price LR, Oshima TC, Nering ML. Standardized conditional SEM: a case for conditional reliability. Appl Psychol Meas 2007;31:169–80.
