Measuring clinical nurse performance: development of the King's Nurse Performance Scale



Pergamon

PII: S0020-7489(97)00009-6

Int. J. Nurs. Stud., Vol. 34, No. 3, pp. 222-230, 1997. © 1997 Elsevier Science Ltd. All rights reserved

Printed in Great Britain. 0020-7489/97 $17.00+0.00

Measuring clinical nurse performance: development of the King’s Nurse Performance Scale

Joanne M. Fitzpatrick, Alison E. While and Julia D. Roberts

Department of Nursing Studies, King’s College London, Cornwall House, Waterloo Road, London SE1 8WA, U.K.

(Received 20 September 1996; revised 3 December 1996; accepted 14 January 1997)

Abstract

The development of the King’s Nurse Performance Scale to measure clinical nurse performance is described. Instrument construction was informed by the Slater Nursing Competencies Rating Scale [Wandelt, M. A. and Stewart, D. S. (1975) Slater Nursing Competencies Rating Scale. Appleton-Century Crofts, New York] together with key literature and the use of expert opinion. The instrument was utilised to observe the clinical performance of senior student nurses (n = 99) and data which were at the ordinal level were statistically analysed using a variety of non-parametric tests. Key findings of students’ observed nursing practice are presented in a separate paper (While et al., unpublished document). Internal consistency testing of the King’s Nurse Performance Scale using Cronbach’s alpha coefficient revealed a promising alpha for the total instrument (r = 0.93). The subsection alphas indicated that further refinement may enhance the strength of the instrument as a tool for the measurement of performance in different domains of practice. The possible use of the Scale in the professional development of newly qualified nurses is suggested. © 1997 Elsevier Science Ltd.

Keywords: Clinical performance; measurement; instrument development.

Background

This research formed part of a larger English National Board (ENB) commissioned study to compare the outcomes of pre-registration nurse education programmes in the United Kingdom (While et al., 1995). At the time of the study, three different programmes were available: registered general nurse programmes, diploma RN programmes, and integrated degree programmes.

Since pre-registration education aims to equip students with sound knowledge and clinical skills, and with attitudes and values favourable to the professional nurse role (ENB, 1994), it is essential to investigate how nurses actually practise in the clinical environment (While, 1994). One approach is direct observation of nurses’ practice; however, this method has only been utilised to a limited extent in nurse education research. The present research sought to address this gap by exploring the performance of senior student nurses from the three programmes using the King’s Nurse Performance Scale during non-participant observation in the ward setting.

Observing actual-situated behaviour

Direct observation has been advocated as the research method of choice when information is required regarding how people behave in their natural environment, whether and how they actually use their skills, or the events which occur in the course of normal activities (Crow, 1984). However, methodological challenges associated with observation research, for example those related to effective instrument development and operationalisation, are well acknowledged in the literature (Fitzpatrick et al., 1996) and may partly account for its limited use to date. A further compounding issue concerns the fact that: “development of a tool, implementation and abandonment occurs too frequently” (Wood, 1982).

Acknowledging these issues, it was decided to draw upon the strengths of an existing instrument; of particular interest to the researchers was the Slater Nursing Competencies Rating Scale (Wandelt and Stewart, 1975). This is a generic tool which focuses observation upon nurse performance as a whole and has been tested for reliability and validity (Ager and Wandelt, 1975). Generic instruments are those which: “contain scales and are designed to assess or measure the quality of nursing care in general, rather than nursing care associated with specific problems” (Tomalin et al., 1992).

The Slater Nursing Competencies Rating Scale (Wandelt and Stewart, 1975) consists of 84 observable items which have been arranged into six subsections: ‘psychosocial individual’ (18 items); ‘psychosocial group’ (13 items); ‘physical’ (13 items); ‘general’ (16 items); ‘communication’ (7 items); and ‘professional implications’ (17 items). Reliability testing has been conducted (using inter-rater reliability, stability and internal consistency tests) and construct, content, predictive and discriminant validity have been examined by Ager and Wandelt (1975).

For example, as an index of inter-rater reliability, intraclass correlation coefficients were selected by Ager and Wandelt (1975) and were calculated using the scores of pairs of observers who had rated the performance of three student groups (n = 74) simultaneously but independently. Values of 0.72, 0.75 and 0.78 were achieved, indicating modest reliability when compared with the criterion of 0.80 which Nunnally (1978) has specified. As a measure of the instrument’s internal consistency, the odd-even split-half reliability and Cronbach’s alpha techniques were employed. The odd-even split-half technique produced a reliability coefficient of r = 0.98, which compares favourably with Nunnally’s (1978) criterion of 0.80. Cronbach’s alpha technique, however, yielded a coefficient of r = 0.74 for the total instrument using 71 of the 84 items (13 items had inadequate sample sizes). Coefficients for the six subsections have not been reported by Ager and Wandelt. The variation between the two coefficient measures may be explained by the possibility of different reliability estimates being obtained when different combinations of splits are used (Polit and Hungler, 1987). Further, Ager and Wandelt (1975) have suggested that the Cronbach’s alpha measure in this instance may be underestimated due to unequal sample sizes for the various intercorrelations (p. 55). To explore the instrument’s underlying dimensions, factor analysis was conducted using 71 of the 84 items which had sufficient cases. Factors with an eigenvalue > 1 were retained and 12 in total were identified, accounting for 83% of the total variance, with factor 1 accounting for 55%. The authors reported that on the varimax rotation items from the six subsections demonstrated some tendency to load on separate factors, which indicates that items were not exclusive to their subsections. Further details regarding where the factors drew their items from have not been reported by Ager and Wandelt.
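The two internal-consistency estimates discussed above can be made concrete with a short sketch (illustrative only, not the authors’ code; the `ratings` matrix is invented for illustration). It computes Cronbach’s alpha and an odd-even split-half coefficient (with the Spearman-Brown step-up correction) on the same synthetic item matrix, showing how the two estimates can differ:

```python
def variance(xs):
    """Sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(rows):
    """rows: one list of item scores per rated subject."""
    k = len(rows[0])                                  # number of items
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])      # variance of total scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def split_half(rows):
    """Odd-even split, corrected to full length with Spearman-Brown."""
    odd = [sum(r[::2]) for r in rows]
    even = [sum(r[1::2]) for r in rows]
    r = pearson(odd, even)
    return 2 * r / (1 + r)

# Hypothetical ratings: 6 subjects x 4 items, each scored 1-5.
ratings = [
    [4, 4, 3, 4],
    [2, 3, 2, 2],
    [5, 4, 4, 5],
    [3, 3, 3, 2],
    [4, 5, 4, 4],
    [2, 2, 3, 2],
]
print(round(cronbach_alpha(ratings), 2))  # ≈ 0.93
print(round(split_half(ratings), 2))      # ≈ 0.95
```

Because a split-half coefficient depends on which split is chosen while alpha is the mean of all possible splits, the two values need not agree, which is one explanation for the divergence (0.98 vs 0.74) noted above.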

The original version of the Slater Nursing Competencies Rating Scale (Slater, 1967) was used in a pilot study by Christman (1971) to examine baccalaureate nurses’ performance (n = 42) using the organisation of nursing care as the independent variable. In another United States (U.S.) study, Petti (1975) utilised the tool to obtain head nurse and patient ratings of nurse performance. Unfortunately, neither Christman (1971) nor Petti (1975) reported having carried out further reliability and validity testing of the Scale. Interestingly, the quality assessment instrument Qualpacs (Wandelt and Ager, 1974), which was also developed from the Slater Nursing Competencies Rating Scale, has been found to be the most valid of three popular generic quality assessment instruments in use in the United Kingdom (Redfern et al., 1994).

Enhancing content specificity

The King’s Nurse Performance Scale drew substantially upon the Slater Nursing Competencies Rating Scale (Wandelt and Stewart, 1975). The latter, however, was not without its limitations; for example, it was developed for use in the US over twenty years previously. It was essential therefore to ensure, as far as possible, that content was specific to and represented a current understanding of nurse performance in the United Kingdom (Anastasi, 1976; Nunnally, 1978; Cronbach, 1984; Linn et al., 1991). Indeed, it has been asserted that failure to establish this requirement: “increases the probability that reliability estimates will simply describe the consistency of irrelevant measures of subject effectiveness” (Peterson et al., 1985). Thus, while acknowledging that an index of content validity cannot be computed, it was nevertheless imperative to maximise quality of content.

In previous attempts to design performance evaluation tools, a variety of strategies to generate and validate the content domain have been adopted, including: incorporating current literature; drawing upon programme curricula; seeking expert opinion; and preliminary observation of people engaged in the activity under consideration to identify key indicators (Sommerfield and Accola, 1978; Stecchi et al., 1983; Cottrell et al., 1986; Gould, 1993). The Critical Incident Technique has also been explored as a potentially useful approach to generate a valid content base (Gorham, 1962; Gorham, 1963; Brumback and Howell, 1972; Sims, 1976; DeBack and Mentowski, 1986). For example, DeBack and Mentowski (1986) interviewed staff nurses and nurse supervisors (n = 83) from three health care settings to elicit critical incidents of effective and ineffective nursing practice. A model of nurse competencies was developed from the interview data and was subsequently used to score perceived differences in the performance of baccalaureate (n = 37), associate degree (n = 8) and diploma (n = 30) nurses, as well as nurses with a higher degree (n = 5). Once again, however, the instrument was utilised to explore perceived differences rather than actual-situated performance. Strategies adopted to enhance the content specificity of the King’s Nurse Performance Scale are discussed in the forthcoming sections.

The construction of scale items

Taking into consideration the above information, the King’s Nurse Performance Scale was constructed with the aim of producing a generic set of observable nursing actions reflective of nurse performance in the United Kingdom (UK) and amenable to accurate discrimination. An analysis of the concept of nurse performance identified key facets of the nurse role which informed the development of the instrument, together with a critical review of the Slater Nursing Competencies Rating Scale (Wandelt and Stewart, 1975).

Within the King’s Nurse Performance Scale, seven domains of nurse performance were formulated and these are detailed below. Items generated from the literature, as well as items drawn from the Slater Nursing Competencies Rating Scale (Wandelt and Stewart, 1975), were assembled to represent each area of practice, and the first draft of the instrument consisted of 67 items grouped into the seven domains of nurse performance. Items were illustrated with cues, the purpose of which was to facilitate the observer training programme, specifically enabling the accurate identification and discrimination of items during observation and minimising observer inference. For example, the cues provided for item 4, ‘Ensures patient receives fluid intake as appropriate’, were as follows: ‘Ensures intravenous fluid is administered according to regimen’; and ‘Acts upon the evidence of a maintained intake and output chart’. To further facilitate quick and accurate recording in the field setting, each item was identified as one whose rating would usually be direct, one whose rating would usually be indirect (e.g. from a written record) or one whose rating may be direct or indirect. The first subsection, ‘physical domain’ (14 items), focused upon nursing actions to address the physical needs of clients. The ‘psychosocial domain’ (6 items) referred to nursing actions which addressed the psychosocial needs of clients. The ‘professional domain’ (9 items) centred upon actions directed towards fulfilling the professional role. Meeting the knowledge needs of clients, self and others was the focus of the ‘promotion of health and teaching skills domain’, which consisted of four items. Management of self and others was the focus of the ‘care management skills and organisation of workload domain’ and was represented by six items. Communication with clients and others was reflected in subsection six, which consisted of five items.
The final subsection encompassed the use of the nursing process approach to the planning and delivery of care (21 items). The subsections were designed to reflect the training regulations of pre-registration courses of nurse preparation (Statutory Instrument No. 1456, 1989), the Code of Professional Conduct (UKCC, 1992) and criteria for effective nurse performance which were derived from a wide literature base. For example, it is expected that the newly qualified nurse should be able to devise, implement and evaluate a plan of care (Statutory Instrument No. 1456, 1989; UKCC, 1992). The conceptual framework underpinning this, the nursing process, was therefore reflected in the subsection ‘use of the nursing process in planning care’, and the work of several key authors contributed to item generation (Mayers, 1978; Yura and Walsh, 1978; Kratz, 1979; Brooking, 1986; Hunt and Marks-Maran, 1986).

Some of the challenges associated with the process of instrument development included avoiding repetition and the inclusion of non-specific items. Analysis of items in the Slater Nursing Competencies Rating Scale informed the developmental process in this study. For example, items one and two of the ‘psychosocial: individual’ subsection of the Slater instrument highlighted the difficulty of generating discrete items. Specifically, item 2 refers to the nurse being a receptive listener; however, this overlaps with item 1, the focus of which is the need to give full attention to the patient. No similar items were incorporated in the King’s Nurse Performance Scale. Further challenges included ensuring item mutual exclusiveness and logical coherence, as well as achieving discrete subsections. For example, associated with attending to clients’ personal hygiene needs (physical domain) is working in collaboration (use of the nursing process in planning care domain), and attending to clients’ sensitivities (psychosocial domain). This challenge is highlighted further when examination of the internal consistency and construct validity of the instrument is discussed.

Validation of content

Key criteria for maximising content validity include an adequate collection of items which represent the domain of investigation (i.e. nurse performance) and appropriate methods of test construction (Nunnally, 1978; Messick, 1989; Streiner and Norman, 1989). From the outset of instrument development, the objective was to select items which provided the most accurate and representative description of effective nurse performance, and this process was informed by an extensive review of the literature. The use of experts is also an accepted strategy to validate the content domain of instruments (Anastasi, 1976; Cronbach, 1984; Streiner and Norman, 1989), and a panel of nine experts drawn from clinical, educational and research settings, together with the Steering Group of the larger study, were involved in the process of instrument review.

The King’s Nurse Performance Scale items and cues were sent to each member of the expert panel and the Steering Group of the larger study, who were asked to respond independently. Experts were asked for their comments upon the Scale as a whole, and the appropriateness of subsections and individual items. The experts were asked to review the instrument for clarity, comprehensiveness and mutual exclusiveness, and to suggest any additional items for inclusion. They were also asked to score each item as an observable indicator of effective nurse performance and to assign a rating on a scale of 0-5 (0 being totally irrelevant and 5 being absolutely essential). It emerged, however, that the scoring system did not always correspond with comments regarding the technical quality of items. Thus, while an item may have been considered an important indicator of effective nurse performance, its construction required modification. Further, in some cases experts chose to note comments rather than assign ratings to items. Thus refinement or deletion of items was made on the basis of consistently critical comments.

Items not explicitly related to the content domain may introduce measurement error since potentially they may discriminate among participants on a dimension which is different from that which the researcher intends to investigate (Streiner and Norman, 1989). In view of this, one important function of the expert review system was the identification of any erroneous items. Distribution of the Scale to the expert panels on the first occasion resulted in the deletion of 10 items: ‘physical domain’ (two items); ‘psychosocial domain’ (one item); ‘professional domain’ (three items); and ‘use of the nursing process in planning care domain’ (five items). Other modifications included item refinement to prevent replication. For example, items 42 and 43 in the ‘physical domain’ of the Slater Nursing Competencies Rating Scale were reflected in the first draft of the King’s Nurse Performance Scale in a modified format. Item 42 read: ‘Recognises hazards to patient safety and takes appropriate action to maintain a safe environment and to give patient a feeling of being safe’ and item 43 read: ‘Carries out safety measures developed to prevent patients from harming themselves or others’. The concept of safety was acknowledged as vital and was therefore reflected in the following item in the instrument: ‘Acts to maintain a safe environment for patient/others’. Other modifications, to facilitate quick and accurate recording in the field setting, included verbalising and identifying each item as one whose rating would usually be direct, one whose rating would usually be indirect (e.g. from a written nursing record) or one whose rating may be direct or indirect. Some adjustments to item cues were also made on the basis of panel review. The process of reduction and refinement resulted in a second draft of the instrument which consisted of 54 items, grouped into seven subsections.

The second draft of the instrument was distributed to each member of the expert panels for independent review. This resulted in some further minor refinement of item wording on the basis of consistent critical comments. For example, ‘Gives verbal/written evidence of insight into patient’s deeper problems/needs’ was modified to read: ‘Gives verbal/written evidence of insight into patient’s psychosocial needs/problems’. Deletion of one item from the ‘use of the nursing process in planning care domain’ was made on the basis of potential overlap, producing a 53-item instrument (see Fig. 1 for an exemplar of a Scale domain). The Scale was redistributed to the experts on a third occasion; however, no further refinements were suggested.

Identifying rating criteria

The next stage of developmental work focused upon identification of appropriate criteria to score the Scale items during the observation process, and several alternatives were considered. The Slater Nursing Competencies Rating Scale, for example, like many clinical evaluation instruments (Moritz and Sexton, 1970; Gennaro et al., 1982; Bond and Jackson, 1990), incorporates the use of a rating scale to score the items. Purported advantages of this format include its ability to direct observation towards specific and clearly focused aspects of behaviour, thereby providing a convenient method for recording observer judgements (Gronlund, 1981; Polit and Hungler, 1987). Further, the capacity to have a range of rating scale points has the potential to convey more meaningful information about the quality of performance and discriminate more accurately between groups (Bondy, 1983).

Fig. 1. Exemplar of Scale domain.

The use of ratings, as Guilford (1954) has emphasised, rests upon the assumption that: “the human observer is a good instrument of quantitative observation, that he is capable of some degree of precision and some degree of objectivity” (p. 278). One strategy to enhance the accuracy of ratings is by defining rating scale points. It is known, for example, that inadequate or omitted explication of rating scale points compromises the reliability and validity of this format (DeMers, 1978; Atwood, 1980; Horn, 1980; Gronlund, 1981; Popham, 1981; Bondy, 1983). This is partly owing to the difficulties of interpretation associated with a variety of formats (e.g. qualitative labels and numerical labels) where criteria are not identified for the scale labels. Moritz and Sexton (1970), for example, used normative labels to score student practice on a five-point rating scale (superior, above average, average, needs improvement and unsatisfactory).


In their discussion regarding instrument administration, Moritz and Sexton concluded that agreement was never reached on use of the ‘average’ label, and they recognised the advantages of clear definitions. Similar problems, coupled with the desire to perform a fair and objective evaluation, have stimulated the development or refinement of instruments by several researchers (Gennaro et al., 1982; Bond and Jackson, 1990).

Similar challenges are associated with the use of an individual or general frame of reference to measure subject performance. For example, items in the Slater Nursing Competencies Rating Scale (Wandelt and Stewart, 1975) are scored using a rating scale and observers adopt an individual or general frame of reference to operationalise a standard of measurement. Using an individual frame of reference, observers using the Slater Nursing Competencies Rating Scale are required to identify exemplars of nurses on a five-point scale (‘best, between best and average, average, between average and poorest and poorest’) against which performance is rated. Alternatively, a general frame of reference may be adopted to rate performance using qualitative labels such as ‘excellent, above average, average, below average and poor’. Undefined or vague definitions, however, may increase the potential for observer error owing to interpretation difficulties (Gronlund, 1981; Popham, 1981; Bondy, 1983). Thus it may be argued that Wandelt and Stewart’s (1975) individual or general frame of reference has a professional base which may be idiosyncratic. In an attempt to minimise such problems, careful definition of the criteria by which to judge behaviour was considered paramount.

Of the alternative rating criteria reviewed, Bondy’s (1983) five-point criterion-referenced rating scale appeared to be the most robust and was adopted for use in the present study. Bondy (1983) developed her rating scale in an attempt to avoid the pitfalls associated with the process of student clinical performance evaluation. She suggested five levels of performance, namely: independent; supervised; assisted; marginal; and dependent, together with a ‘not observed’ category. The ‘not observed’ category was not included in this study; instead an ‘omitted care’ category was created, since Redfern et al. (1993) have argued with reference to Qualpacs (Wandelt and Ager, 1974) that there is a potentially informative distinction between ‘poorest care’ and ‘omitted care’. During observation, Bondy’s (1983) levels of performance were considered under three key categories. The first category, professional standards and procedures for the behaviour, encompasses the issues of safety (for clients, self and others), accuracy (incorporating the application of research-based knowledge to practice), effect (achieving the intended purpose of the behaviour) and affect (the manner in which the behaviour is performed). The focus of the second category is the qualitative aspects of performance and includes the use of time, space, equipment and energy. Additionally, demonstration of persistence under adverse circumstances was incorporated into this category since it was considered to be an important contributor to, and distinguisher between, high and low quality nursing care (Fordham, 1991, personal communication). The final category addresses the type and degree of assistance required to carry out the nursing activity.

The positive effect of Bondy’s (1983) criterion-referenced rating scale on the accuracy and reliability of ratings has been demonstrated in an experimental study in which videotapes depicting nursing activities (e.g. drug administration, a nurse-client interview procedure and a dressing technique) were produced to reflect the five levels of performance (Bondy, 1984). An observation schedule was also constructed, consisting of items reflecting the cognitive (five items), affective (five items) and psychomotor (four items) domains, which were rated using Bondy’s five-point scale. The scale points were numerically labelled (5, 4, 3, 2, 1 and X for not observed) for the control group and I, S, A, M, D and X (not observed) for the experimental group. Explanatory information about the study and the 14-item schedule was presented to both groups. Only the experimental group, however, received an explanation of the rating criteria. Results indicated that using the rating criteria enhanced accuracy in the rating process and, as student performance improved, the beneficial effect of using the criteria became more pronounced. Further, there was evidence of discrimination using the five-point scale. Bondy’s (1983) criterion-referenced rating scale was therefore judged a potentially sensitive method to rate the Scale items during observation of nurse performance.

Extraneous variables influencing clinical performance

It is impossible to examine nurse performance without taking account of potentially influencing factors. Human performance is influenced by a variety of extrinsic and intrinsic variables (Fitzpatrick et al., 1996) and, taking these into consideration, contextual data which reflected the ward environment at the time of observation were recorded for each session. Further, participants were observed continuously for a period of 2½ hr on three separate occasions at different times of the day in an effort to accommodate the possible influence of any such variations.

Additional refinements

Further refinements were made to the Scale as a result of pilot work which was undertaken over a three-month period and involved observation of final year diploma students (n = 7) and nursing degree students (n = 5) drawn from a location separate from the main study. Ethical approval was obtained for all participating institutions and informed consent was gained from all participants. As a result of this phase of the research, minor adjustments were made to Scale items and Bondy’s (1983) rating criteria were modified (Fitzpatrick et al., 1996). No item reduction occurred. Internal consistency testing of the tool and principal components analysis were not conducted at this stage due to the small data set. Examination of inter-observer reliability and observer drift is discussed in a separate paper which details operationalisation of the instrument (Fitzpatrick et al., 1996).

Reliability and validity testing of the instrument

As highlighted previously, the target population for this research was students completing their nursing programmes for Part 1 (RGN) and Part 12 (Adult Nursing) of the Register. Non-participant observation of senior students representing the three programmes (n = 99) took place in the hospital setting and each participant was observed on three separate occasions at different times of the day, totalling 742.5 hr of observation. Data analysis, which commenced on completion of data collection, involved calculating a total mean performance score and a mean score for each of the seven domains for each participant. The formula used was: the sum of weighted totals divided by the total number of ratings. The weightings reflected the performance levels: independent (4), assisted (3), marginal (2) and dependent (1). The omitted column was not assigned a numerical weighting but was reflected in the total number of ratings for each observation period. Non-parametric tests were applied to explore relationships between the different participant groups and key findings are presented elsewhere (While et al., 1996). The data were also used to conduct reliability and validity testing of the instrument, details of which are presented below.
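The scoring rule just described can be illustrated with a short sketch. This is a hypothetical rendering of the stated formula; the function and label names are ours, not the authors’.

```python
# Sketch of the scoring formula described above: mean performance score =
# sum of weighted ratings / total number of ratings. Weights follow the text;
# "omitted" items carry no weight but still count in the denominator.
# All names are illustrative, not taken from the original instrument.

WEIGHTS = {"independent": 4, "assisted": 3, "marginal": 2, "dependent": 1}

def mean_performance_score(ratings):
    """ratings: list of labels, each a WEIGHTS key or "omitted"."""
    if not ratings:
        raise ValueError("no ratings recorded for this observation period")
    weighted_total = sum(WEIGHTS.get(r, 0) for r in ratings)  # "omitted" -> 0
    return weighted_total / len(ratings)  # denominator includes omitted items

score = mean_performance_score(["independent", "assisted", "omitted", "marginal"])
print(score)  # (4 + 3 + 0 + 2) / 4 = 2.25
```

Note that, on this reading of the text, every omitted item lowers the mean because it enlarges the denominator without contributing weight.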

Internal consistency testing of the instrument

The data were used to examine the internal consistency of the King’s Nurse Performance Scale using Cronbach’s alpha technique. The latter provides a good estimate of reliability in most situations since the major source of measurement error arises from the sampling of content (Nunally, 1978). The total Scale yielded an alpha of r = 0.93 using the means of the subsection scores (n = 98; one case was excluded owing to insufficient data for analysis). Since the Scale comprised seven subsections it was also important to examine the internal consistency of each (Waltz et al., 1991). Using the item means it was only possible to calculate an alpha coefficient for three of the seven subsections (those with more than 50 cases). The coefficient alphas were: ‘physical domain’ r = 0.74 (n = 51); ‘professional domain’ r = 0.70 (n = 60); and ‘promotion of health and teaching skills domain’ r = 0.71 (n = 61). The subsection alphas did not reach Nunally’s (1978) criterion of 0.80, which suggests independent components within the instrument. Interestingly, insufficient score data in the ‘psychosocial: group’ section for Qualpacs (Wandelt and Ager, 1974) ratings was also a significant problem in the research of both Carr-Hill et al. (1992) and Redfern et al. (1994).

An alternative method was also used to compute coefficient alphas for the subsections, namely using the mean score for each observation period. Using this method it was possible to compute coefficient alphas for all seven subsections; however, once again they did not reach the criterion of 0.80 (Nunally, 1978): ‘physical domain’ r = 0.58 (n = 99); ‘psychosocial domain’ r = 0.53 (n = 62); ‘professional domain’ r = 0.60 (n = 99); ‘promotion of health and teaching skills domain’ r = 0.46 (n = 93); ‘care management skills and organisation of workload domain’ r = 0.60 (n = 99); ‘communication skills domain’ r = 0.65 (n = 99); and ‘use of the nursing process in planning care domain’ r = 0.72 (n = 99).
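The coefficient used above follows a standard formula, which can be sketched as below. This is a generic implementation with invented illustrative scores; it is not the authors’ analysis code or data.

```python
# Generic sketch of Cronbach's alpha as used above: k columns (items or
# subsection means), one row per participant. Standard formula:
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance of totals).

def cronbach_alpha(rows):
    """rows: list of equal-length lists; rows = participants, columns = items."""
    k = len(rows[0])

    def variance(xs):  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Four hypothetical participants rated on three items (weights 1-4)
data = [
    [4, 4, 3],
    [3, 3, 3],
    [2, 3, 2],
    [4, 3, 4],
]
print(round(cronbach_alpha(data), 2))  # -> 0.75
```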

In summary, internal consistency testing suggested some independence within the subsections of the instrument. There was evidence nonetheless of overall

Table 1 Content and percentage of variance for each component

Component | Description | % of variance
1 | Provides for the psychological, social and physical needs of patients using a multidisciplinary approach to planning and delivery of care | 43.5
2 | Self-directing care management and organisation of workload | 9.3
3 | Attention to patient safety with adherence to regulations, policy directives and research findings | 7.4
4 | Attention to patients’ physical care needs for hygiene with sensitivity and encouraging patient participation in care | 5.2
5 | Attention to patients’ activity in accordance with their current and potential health status | 4.6
6 | Safe administration of intravenous/parenteral fluids with patient consultation and teaching | 4.2
7 | Effective communication about patient supported by patient participation in the evaluative process | 3.8
8 | Attention to patients’ dietary intake appropriate to associated actual/potential problems | 3.2

Table 2 Number of items within each component

Component | Number of items
1 | 15 (11, 14, 16, 21-23, 27-29, 32-35, 38, 41)
2 | 14 (4, 7, 11-12, 20-21, 23, 27-28, 32, 37-38, 41, 51)
3 | 11 (6-9, 12, 16, 19, 27, 40, 47-48)
4 | 9 (2-4, 16, 22, 38, 41, 47, 51)
5 | 8 (1, 11-12, 20, 24, 29, 40, 51)
6 | 7 (6, 21-22, 24, 28, 32, 47)
7 | 4 (16, 29, 37, 51)
8 | 3 (5, 20, 40)

coherence, and the total alpha coefficient of r = 0.93 is superior to that published for the Slater Nursing Competencies Rating Scale (r = 0.74), as reported by Ager and Wandelt (1975). Unfortunately, coefficients for the six subsections of the Slater Nursing Competencies Rating Scale have not been reported by Ager and Wandelt.

Examination of the instrument’s construct validity

Principal components analysis was conducted to explore the instrument’s underlying dimensions (Kim and Mueller, 1978; Dunteman, 1991) and 33 items had sufficient cases to be included in the analysis. Eight components with eigenvalues > 1 were extracted from the rotated component matrix, with the first principal component explaining 43.5% of the total variance and the second accounting for just under 10%. Interpretation was based on loadings with values > 0.30. The contents of the components and their percentage of variance are summarised in Table 1. In common with findings regarding the internal consistency of the Slater Nursing Competencies Rating Scale (Ager and Wandelt, 1975) and the Qualpacs instrument (Fox and Ventura, 1984), items in the King’s Nurse Performance Scale demonstrated some tendency to load on separate factors (Table 2). The 14 items which loaded on component 1 are set out in Table 3 and it was noteworthy that these items were drawn from six of the seven subsections. Component 1 appears to focus upon two inter-related areas of nursing: (i) an individual and holistic plan of care, including skilled communication and performing effectively and responsibly in the planning and delivery of care with a multidisciplinary approach (derived from 10 items); and (ii) the psychosocial component of nursing care and, in particular, developing and maintaining a therapeutic nurse–client relationship (derived from four items).
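The kind of principal components extraction reported above can be sketched in a few lines. The data below are synthetic stand-ins for the real 99 × 33 item matrix (which is not available here), and the eigenvalue > 1 retention criterion is the one stated in the text; a rotation step would follow in practice before interpreting loadings.

```python
# Generic sketch of principal components analysis: eigendecomposition of the
# item correlation matrix, retaining components with eigenvalues > 1.
# Synthetic data only; not the authors' analysis.
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=(99, 1))                      # shared "performance" signal
items = base + rng.normal(scale=0.8, size=(99, 6))   # 6 correlated mock items

corr = np.corrcoef(items, rowvar=False)              # 6 x 6 item correlations
eigenvalues = np.linalg.eigvalsh(corr)[::-1]         # sorted largest first
retained = eigenvalues[eigenvalues > 1]              # eigenvalue > 1 criterion
explained = 100 * eigenvalues / eigenvalues.sum()    # % of variance per component
```

Because the eigenvalues of a correlation matrix sum to the number of items, each retained component necessarily explains more variance than a single item would on its own.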

Examination of convergent validity

As well as observation of practice, senior students’ performance in a care planning simulation exercise was examined as part of the larger study (While et al., 1995). These data were used to examine convergent validity and, using Spearman’s rank correlation coefficient, the results showed a modest correlation between the total score for observed practice and the global score for the care plan (r = 0.185, P < 0.05). Statistically significant associations also emerged between: a higher observation score for use of the nursing process in planning care in the ward setting and a higher global care plan score (r = 0.181, P < 0.05); a higher score for observed practice in the psychosocial domain and a higher score for the psychosocial domain in the care plan (r = 0.227, P < 0.025); and a higher total score for observed practice and a higher score for problem identification in the care plan (r = 0.192, P < 0.05). It is possible that these findings reflect the complexity of the link between nurses’ performance in the clinical environment and their care planning. Thus, further refinement and testing of the methods may enhance convergent validity; however, it is also possible that the tools measure different aspects of nurse performance, which may in part explain the modest convergent validity.
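Spearman’s rank correlation, used for the convergent validity checks above, can be sketched as follows: rank both score lists (averaging over ties), then apply Pearson’s formula to the ranks. The scores below are invented for illustration.

```python
# Sketch of Spearman's rank correlation: correlate the ranks of two score
# lists. Ties receive the mean of the ranks they span. Illustrative data only.

def ranks(xs):
    """Mean ranks (1-based), averaging over tied values."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                        # extend over the tie group
        mean_rank = (i + j) / 2 + 1       # average rank for positions i..j
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def spearman_rho(xs, ys):
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

observed = [2.1, 3.4, 2.8, 3.9, 3.0]    # e.g. observed-practice mean scores
care_plan = [1.0, 3.0, 2.0, 4.0, 2.5]   # e.g. care-plan global scores
print(round(spearman_rho(observed, care_plan), 2))  # -> 1.0 (identical orderings)
```

Because only the orderings matter, the coefficient suits the ordinal data produced by the Scale.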

Table 3 Items loading on first principal component

Item | Description | Loading
14 | Attends to or helps appropriately the distressed/emotional state of the patient | 0.83
35 | Establishes rapport with patient/family/significant other | 0.82
34 | Spends time with patient as appropriate | 0.81
27 | Provides information in a comprehensible way to patient/significant other/staff | 0.71
33 | Contributes as nurse member of multi-disciplinary team caring for patient | 0.69
22 | Is reliable; seeks guidance/help when necessary | 0.63
32 | Is a constructive team member and leader where appropriate | 0.62
16 | Gives verbal/written evidence of insight into patient’s psychosocial needs/problems | 0.55
38 | Communicates clearly in speech about patient | 0.50
21 | Cares for assigned patients; knows where and how they are | 0.46
29 | Adapts care to patient’s physical and mental abilities | 0.40
11 | Acts to provide relief for the physically distressed patient | 0.38
28 | Distributes her/his time appropriately between her/his allocated patients | 0.37
23 | Is self-directing; takes initiative | 0.31


Conclusion

The development of the King’s Nurse Performance Scale was informed by the Slater Nursing Competencies Rating Scale (Wandelt and Stewart, 1975) together with key literature and expert opinion. The Scale is able to measure performance in the different domains of nursing practice; however, its particular strength lies in its ability to measure overall nurse performance in the clinical setting, as demonstrated by the favourable Cronbach’s alpha coefficient (r = 0.93) for the total instrument. The subsection alphas indicated that further refinement may improve the strength of the instrument as a tool for the measurement of performance in different domains of practice.

Further refinement of the observation method is therefore required in order to enhance both the validity and reliability of this method as a means of measuring actual, situated nurse performance. Nevertheless, the King’s Nurse Performance Scale is an empirically based generic tool with a content domain specific to nurse performance in the UK. It permits detailed examination of nurses’ practice of care delivery, thus enabling the identification of strengths and weaknesses in nurse clinical performance which could be utilised in the professional development of newly registered nurses. However, it is acknowledged that this 53-item Scale demands significant non-participant observation skills which will require a major training programme before its competent use by nurses in clinical practice.

Acknowledgements

The authors were engaged in a comparative study of outcomes of pre-registration nurse education programmes commissioned by the English National Board for Nursing, Midwifery and Health Visiting; this article draws upon that work. Responsibility for the views expressed, issues of interpretation and questions of inclusion and omission remains, as always, with the research team, and the views expressed do not necessarily reflect those of the English National Board for Nursing, Midwifery and Health Visiting.

References

Ager, J. W. and Wandelt, M. A. (1975) Tests of the Scale. In Slater Nursing Competencies Rating Scale, ed. M. A. Wandelt and D. S. Stewart. Appleton-Century-Crofts, New York.

Anastasi, A. (1976) Psychological Testing. Macmillan Publishing Co., New York.

Atwood, J. R. (1980) A research perspective. Nursing Research 29(2), 104-108.

Bond, M. L. and Jackson, E. (1990) Maternal-infant clinical nurse specialist performance assessment: development of an evaluation tool. Clinical Nurse Specialist 4(4), 180-186.

Bondy, K. N. (1983) Criterion-referenced definitions for rating scales in clinical evaluation. Journal of Nursing Education 22(9), 376-382.

Bondy, K. N. (1984) Clinical evaluation of student performance: the effects of criteria on accuracy and reliability. Research in Nursing and Health 7, 25-33.

Brooking, J. I. (1986) Patient and Family Participation in Nursing Care: the Development of a Nursing Process Measuring Scale. Unpublished PhD Thesis, University of London.

Brumback, G. B. and Howell, M. A. (1972) Rating the clinical effectiveness of employed physicians. Journal of Applied Psychology 56(3), 241-244.

Carr-Hill, R., Dixon, P., Gibbs, I., Griffiths, M., McCoughan, D. and Wright, K. (1992) Skill Mix and the Effectiveness of Nursing Care. Centre for Health Economics, University of York.

Christman, N. J. (1971) Clinical performance of baccalaureate graduates. Nursing Outlook 19(1), 54-56.

Cottrell, B. H., Cox, B. H., Kelsey, S. J., Ritchie, P. J., Rumph, E. A. and Shannahan, M. K. (1986) A clinical evaluation tool for nursing students based on the nursing process. Journal of Nursing Education 25(7), 270-274.

Cronbach, L. J. (1984) Essentials of Psychological Testing, 4th edn. Harper and Row Publishers, New York.

Crow, R. (1984) Observation. In The Research Process in Nursing, ed. D. F. S. Cormack, pp. 90-91. Blackwell Scientific Publications, Oxford.

DeBack, V. and Mentkowski, M. (1986) Does the baccalaureate make a difference: differentiating nurse performance by education and experience. Journal of Nursing Education 25(7), 275-285.

DeMers, J. L. (1978) Observational assessment of performance. In Evaluating Competence in the Health Professions, ed. M. K. Morgan and D. Irby. C. V. Mosby, St Louis, MO.

Dunteman, G. H. (1991) Principal Components Analysis. Sage Publications, Newbury Park, CA.

English National Board for Nursing, Midwifery and Health Visiting (ENB) (1994) Creating Lifelong Learners: Partnerships for Care. Guidelines for Pre-Registration Nursing Programmes of Education. ENB, London.

Fitzpatrick, J. M., While, A. E. and Roberts, J. D. (1996) Operationalisation of an observation instrument to explore nurse performance. International Journal of Nursing Studies 33(4), 349-360.

Fordham, M. (1991) Personal communication.

Fox, R. N. and Ventura, M. R. (1984) Internal psychometric characteristics of the quality patient care scale. Nursing Research 33(2), 112-117.

Gennaro, S., Thielen, P., Chapman, N., Martin, J. and Barnett, D. C. (1982) The birth, life and times of a clinical evaluation tool. Nurse Educator 7(1), 27-32.

Gorham, W. A. (1962) Staff nursing behaviours contributing to patient care and improvement. Nursing Research 11, 68-79.

Gorham, W. A. (1963) Methods for measuring staff nursing performance. Nursing Research 12(1), 4-11.

Gould, D. (1993) Knowledge, Opinions and Practice of Essential Infection Control Measures: a Comparative Study of Nurses in Different Clinical Settings. Unpublished PhD Thesis, University of London.

Gronlund, N. E. (1981) Measurement and Evaluation in Teaching. Macmillan Publishing Co., New York.

Guilford, J. P. (1954) Psychometric Methods, 2nd edn. McGraw-Hill, New York.

Horn, B. J. (1980) Establishing valid and reliable criteria. Nursing Research 29(2), 88-90.

Hunt, J. M. and Marks-Maran, D. J. (1986) Nursing Care Plans: the Nursing Process at Work. John Wiley and Sons, Chichester.

Kim, J. O. and Mueller, C. W. (1978) Factor Analysis: Statistical Methods and Practical Issues. Sage Publications, Beverly Hills, CA.

Kratz, C. R. (ed.) (1979) The Nursing Process. Bailliere Tindall, London.

Linn, R. L., Baker, E. L. and Dunbar, S. B. (1991) Complex performance-based assessment: expectations and validation criteria. Educational Researcher 20(8), 15-21.

Mayers, M. G. (1978) A Systematic Approach to the Nursing Care Plan. Appleton-Century-Crofts, New York.

Messick, S. (1989) Validity. In Educational Measurement, ed. R. L. Linn, 3rd edn. American Council on Education/Macmillan, Phoenix.

Moritz, D. A. and Sexton, D. L. (1970) Evaluation: a suggested method for appraising quality. Journal of Nursing Education 9(1), 17-34.

Nunally, J. C. (1978) Psychometric Theory. McGraw-Hill, New York.

Peterson, D., Micceri, T. and Smith, O. (1985) Measurement of teacher performance: a study in instrument development. Teaching and Teacher Education 1(1), 63-77.

Petti, E. R. (1975) A Study of the Relationship between the 3 Levels of Nursing Education on Nurse Competency as Rated by Patient and Head Nurse. Unpublished Doctoral Thesis, Boston University.

Polit, D. F. and Hungler, B. P. (1987) Nursing Research: Principles and Methods, 3rd edn. J. B. Lippincott Co., Philadelphia.

Popham, W. J. (ed.) (1981) Modern Educational Measurement. Prentice Hall, Englewood Cliffs, NJ.

Redfern, S. J., Norman, I. J. with Tomalin, D. A., Oliver, S. and Jacka, K. (1994) The Validity of Quality Assessment Instruments in Nursing: Final Report to the Department of Health. Nursing Research Unit, King's College London.

Redfern, S. J., Norman, I. J., Tomalin, D. A. and Oliver, S. (1993) Assessing quality of nursing care. Quality in Health Care 2, 124-128.

Sims, A. (1976) The critical incident technique in evaluating student nurse performance. International Journal of Nursing Studies 13, 123-130.

Slater, D. S. (1967) Slater Nursing Competencies Rating Scale. College of Nursing, Wayne State University, Detroit.

Sommerfield, D. P. and Accola, K. M. (1978) Evaluating students' performance. Nursing Outlook 26, 432-436.

Statutory Instrument No. 1456 (1989) The Nurses, Midwives and Health Visitors (Registered Fever Nurses Amendment Rules and Training Amendment Rules) Approval Order. HMSO, London.

Stecchi, J. M., Woltman, S. J., Wall-Haase, C., Heggestad, B. and Zier, M. (1983) A comprehensive approach to clinical evaluation: one teaching team's solution to clinical evaluation of students in multiple settings. Journal of Nursing Education 22(1), 38-46.

Streiner, D. L. and Norman, G. R. (1989) Health Measurement Scales: a Practical Guide to their Development and Use. Oxford University Press, New York.

Tomalin, D. A., Redfern, S. J. and Norman, I. J. (1992) Monitor and Senior Monitor: some problems of administration and some proposed solutions. Journal of Advanced Nursing 17, 72-82.

United Kingdom Central Council (UKCC) (1992) Code of Professional Conduct for the Nurse, Midwife and Health Visitor, 3rd edn. UKCC, London.

Waltz, C. F., Strickland, O. L. and Lenz, E. R. (1991) Measurement in Nursing Research, 2nd edn. F. A. Davis Co., Philadelphia.

Wandelt, M. A. and Ager, J. W. (1974) Quality Patient Care Scale. Appleton-Century-Crofts, New York.

Wandelt, M. A. and Stewart, D. S. (1975) Slater Nursing Competencies Rating Scale. Appleton-Century-Crofts, New York.

While, A. E. (1994) Competence versus performance: which is the more important? Journal of Advanced Nursing 20, 525-531.

While, A. E., Roberts, J. D. and Fitzpatrick, J. M. (1995) A Comparative Study of Outcomes of Pre-Registration Nurse Education Programmes. ENB, London.

While, A. E., Fitzpatrick, J. M. and Roberts, J. D. (1996) A comparison of outcomes from different pre-registration nurse education courses. Unpublished document.

Wood, V. (1982) Evaluation of student clinical performance: a continuing problem. International Nursing Review 29(1), 11-18.

Yura, H. and Walsh, M. B. (1978) The Nursing Process: Assessing, Planning, Implementing and Evaluating. Appleton-Century-Crofts, New York.