Studying the spatial and temporal distribution of SO2 in an urban area by principal component factor...


Atmospheric Research, 20 (1986) 53-65. Elsevier Science Publishers B.V., Amsterdam. Printed in The Netherlands

STUDYING THE SPATIAL AND TEMPORAL DISTRIBUTION OF SO2 IN AN URBAN AREA BY PRINCIPAL COMPONENT FACTOR ANALYSIS

M.L. SANCHEZ (1), J.L. CASANOVA (1), M.C. RAMOS (1) and J.L. SANCHEZ (2)

(1) Departamento de Física Fundamental, Facultad de Ciencias, Valladolid (Spain)
(2) Cátedra de Física, Facultad de Veterinaria, León (Spain)

(Received June 6, 1985; accepted after revision November 29, 1985)

ABSTRACT

Sanchez, M.L., Casanova, J.L., Ramos, M.C. and Sanchez, J.L., 1986. Studying the spatial and temporal distribution of SO2 in an urban area by principal component factor analysis. Atmos. Res., 20: 53-65.

The spatial distribution of SO2 concentration is studied as measured at nine locations in the urban area of Valladolid, using the principal component factor analysis. The three retained components are interpreted as caused by different types of sources. Also the temporal variation of SO2 concentrations has been studied at each location in relation to 21 meteorological parameters and to a persistence index of the concentrations. To avoid the use of correlated parameters the authors have at the same time applied the principal component factor analysis on 21 variables. Then the multiple linear regression was fitted between the SO2 concentrations measured at each location, and the seven retained meteorological components. In this way the interpretation of the spatial distribution could be confirmed.

RÉSUMÉ

The spatial distribution of the SO2 concentration measured at 9 stations in the urban region of Valladolid was studied using principal component analysis (PCA). The three retained components were interpreted in terms of the influence of different classes of sources. In addition, the temporal variation of SO2 at each station was studied as a function of 21 meteorological parameters and of a persistence index of the concentrations. To avoid the use of correlated parameters, principal component analysis was also applied to the set of 21 variables. A multiple linear regression was then performed between the SO2 concentrations measured at each station and the 7 meteorological components obtained. The interpretation of the spatial distribution could thus be confirmed.

INTRODUCTION

Sulphur dioxide is one of the basic pollutants whose control has become important in air pollution strategies. Owing to its well-known seasonal variations (Stern, 1975) and its multiple sources, sulphur dioxide has been the subject of intensive study in the past few years.

0169-8095/86/$03.50 © 1986 Elsevier Science Publishers B.V.


Knowledge of the behaviour of SO2 in an urban area must cover two main aspects: its spatial distribution and its temporal variation. The spatial distribution of the concentrations depends, of course, on the location of the sources in the city. The variability of SO2 concentrations is strongly influenced by the local meteorological variables and the strength of the sources.

This report attempts to identify the basic sources and their area of impact in the city of Valladolid, and to determine whether the variability of the concentrations is related to meteorological parameters. This study may be important in identifying the concentrations and the nature of the pollutants in order to choose the most suitable model to forecast SO2 in the city.

The preceding objectives can be addressed by using multivariate techniques, since these statistical procedures are helpful in providing a summarizing description of the causes controlling the variability of the data set. Among such multivariate techniques, principal component factor analysis is considered one of the most effective approaches in the literature, and it has been used by several authors to analyse different environmental pollution problems (Adams et al., 1975; Hopke et al., 1981; Roscoe and Hopke, 1982).

In this study we have applied the principal component analysis (PCA) procedure to daily SO2 concentrations measured during the period December 1982-March 1983 in Valladolid. The three retained components grouped different monitoring samplers, and each one of them was associated with different types of sources.

In order to analyse the uncorrelated variables which cause significant variations in the SO2 concentrations, 21 meteorological variables were first classified in groups by using, again, the principal component analysis (PCA) procedure. The concentrations of SO2 recorded at each sampler point were then fitted by a linear multiple regression on the retained components. This approach enabled us to confirm the interpretation of each one of the retained SO2 components, as well as to identify the variables which affected the concentrations.

PRINCIPAL COMPONENT ANALYSIS

The principal component analysis is a statistical procedure which explores the data-reduction possibilities by constructing a set of new variables on the basis of the interrelations exhibited in the data set. Its objective is to transform a given set of m variables into a new set of composite variables or principal components that are orthogonal to each other.

The principal component model can be expressed as:

$$ z_i = \sum_{p} A_{ip} F_p \qquad (p = 1, 2, \ldots, m) \qquad (1) $$

where each one of the observed z variables is described in terms of p new components, $F_1, F_2, \ldots$, each one of which is in turn defined as a linear combination of the original variables. The $A_{ip}$ terms are called factor loadings and they represent the weight of each variable in each one of the components.

The key to the calculation of the principal components is the correlation matrix R (m, m), which is diagonalized. The first component, associated with the largest eigenvalue, is the linear combination which explains as much as possible of the overall variance of the system. The second component is the linear combination, orthogonal to the first, which explains as much as possible of the residual variance of the system, and so on.

The importance of a component may be evaluated by examining the proportion of the overall variance accounted for:

$$ \frac{\lambda_i}{n} \qquad (2) $$

where $\lambda_i$ represents the eigenvalue of the ith component and n represents the number of variables in the set. Usually, the first few principal components explain a high percentage of the overall variance of the data and they characterize the information content of the data. Therefore, only p components are required to reproduce the z values or, which is the same, the data set can be described in a lower, p-dimensional space.
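As an illustration of the procedure just described, the following minimal sketch diagonalizes a correlation matrix with NumPy and evaluates the proportion of variance of Eq. 2 for each component. The synthetic data, the function name `pca_from_correlation` and the scaling of the loadings (eigenvector times the square root of its eigenvalue) are assumptions made for the example; the paper does not state which software or normalization was used.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA of X (rows = observations, columns = variables) via the correlation matrix."""
    # Standardize each variable to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix R (m x m).
    R = np.corrcoef(X, rowvar=False)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Proportion of the overall variance per component: eigenvalue / number of variables.
    explained = eigvals / R.shape[0]
    # Assumed loading convention: eigenvector scaled by sqrt(eigenvalue).
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    # Component scores: projections of the standardized data.
    scores = Z @ eigvecs
    return eigvals, explained, loadings, scores

# Hypothetical data: four variables sharing one common signal plus noise.
rng = np.random.default_rng(0)
common = rng.normal(size=(120, 1))
X = common + 0.5 * rng.normal(size=(120, 4))
eigvals, explained, loadings, scores = pca_from_correlation(X)
```

With this convention the full loading matrix reproduces the correlation matrix (`loadings @ loadings.T`), and the component scores are mutually uncorrelated.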

The importance of a given component for a given variable can be expressed in terms of the variance of the variable that can be accounted for by the component, that is to say, by the square of its factor loading.

From here, it is inferred that the loading matrix A (m, p) contains all the information related to the interrelationships among the variables.

After selecting the adequate number of p components that explains a high percentage of the variance of the system, it is necessary to interpret the meaning of each component looking for the common attributes of the groups of variables included with high factor loadings in each component.

To facilitate the interpretation of the retained components it is common practice to rotate the reduced factor matrix A(m, p) in such a way as to maximize the number of values which are zero or unity, in trying to achieve what is called 'simple structure' (Harman, 1976). Different rotation criteria have been proposed in the literature to achieve it. The VARIMAX criterion retains the orthogonality of the system and is one of the most commonly used (Gatz, 1978; Heidam, 1982; Henry et al., 1984).
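To make the rotation step concrete, here is a minimal sketch of a raw varimax rotation using pairwise planar rotations with the classical closed-form angle. It is an illustration only: the paper does not say which varimax implementation was used, no Kaiser row normalization is applied, and the loading matrix `simple` is a hypothetical example.

```python
import numpy as np

def varimax_criterion(L):
    """Sum over components of the variance of the squared loadings."""
    return float(np.var(L**2, axis=0).sum())

def varimax(A, max_sweeps=50, tol=1e-10):
    """Raw varimax rotation of A (m variables x p components) by planar rotations."""
    L = np.array(A, dtype=float)
    m, p = L.shape
    T = np.eye(p)                      # accumulated orthogonal rotation, L = A @ T
    for _ in range(max_sweeps):
        largest = 0.0
        for j in range(p - 1):
            for k in range(j + 1, p):
                x, y = L[:, j], L[:, k]
                u, v = x**2 - y**2, 2.0 * x * y
                a, b = u.sum(), v.sum()
                c, d = (u**2 - v**2).sum(), 2.0 * (u * v).sum()
                # Angle that maximizes the criterion in the (j, k) plane.
                phi = 0.25 * np.arctan2(d - 2.0 * a * b / m,
                                        c - (a**2 - b**2) / m)
                cs, sn = np.cos(phi), np.sin(phi)
                plane = np.array([[cs, -sn], [sn, cs]])
                L[:, [j, k]] = L[:, [j, k]] @ plane
                T[:, [j, k]] = T[:, [j, k]] @ plane
                largest = max(largest, abs(phi))
        if largest < tol:              # no plane needed a further rotation
            break
    return L, T

# Hypothetical loadings: a clean two-group structure blurred by a 0.5 rad rotation.
simple = np.array([[0.90, 0.00], [0.80, 0.10], [0.85, 0.00],
                   [0.00, 0.90], [0.10, 0.80], [0.00, 0.85]])
theta = 0.5
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A = simple @ rot
rotated, T = varimax(A)
```

Because the rotation is orthogonal, the row sums of squared loadings (the variance of each variable accounted for by the retained components) are unchanged; only their split among components changes.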

PCA TO THE SO2 CONCENTRATIONS

Daily SO2 concentrations supplied by the monitoring network of the Valladolid Health Bureau were available. During the period December-March nine monitoring samplers were operating simultaneously at the locations shown in Fig. 1. Table I shows the maximum, minimum and average concentrations and the standard deviations recorded by each monitoring sampler.


Fig. 1. Network and location of main sources in the city.

The concentrations were first standardized by subtracting the mean and dividing by the standard deviation at each of the sampler points. Table II shows the factor matrix A(m, p) after the VARIMAX rotation has been applied. The choice of the first three components was regarded as the best approach after the solutions found for 2, 3 and 4 components were analysed and the improvements in the variance of each variable were evaluated in each case (Roscoe and Hopke, 1982). The proportion of total variance accounted for by the three components was 83.8%. The last column of Table II shows the variance of each variable accounted for by the combination of all components.

The first component explains 65.4% of the overall variance and it shows high factor loadings for the monitoring samplers 1, 2, 4, 5 and 9. The pollution levels of these sampler points are rather different. Thus, the average concentration recorded at sampler 5 (located in the downtown area) is the highest, while sampler points 1 and 9 (located in the outskirts) show the lowest average concentrations. This result suggests that the first component groups all the sampler points influenced by the SO2 emissions produced by domestic heating facilities, that is to say, by discontinuous sources. In addition, looking at Fig. 1, we can see that no continuous sources are found in their immediate surroundings. So, the first component features the domestic-heating, city-wide pollution.

TABLE I

Maximum CMAX, minimum CMIN, average CAVER and standard deviation σ (all in µg m⁻³) of the daily SO2 concentrations measured during the December-March (D-M) winter period

Sampler point    CMAX    CMIN   CAVER       σ
1               187.0    10.0    53.7    35.3
2               297.0    10.0    76.7    59.9
3               325.0    30.0   111.1    67.6
4               332.0    13.0    84.6    62.2
5               388.0    23.0   129.8    72.6
6               446.0    44.0   118.7    52.6
7               200.0    22.0    94.0    41.1
8               271.0    14.0    90.6    50.2
9               165.0     9.0    37.2    23.8

TABLE II

Rotated factor matrix solution for SO2 concentrations

Sampler point       F1       F2       F3     σ²
1               0.8316   0.1588   0.1916   0.76
2               0.8219   0.2803   0.3331   0.88
3               0.3362   0.2182   0.9028   0.98
4               0.8354   0.3828   0.0609   0.85
5               0.8054   0.3555   0.1671   0.80
6               0.3422   0.8257   0.1964   0.84
7               0.1894   0.8688   0.0641   0.80
8               0.3359   0.8683   0.2162   0.91
9               0.7403   0.2493   0.3382   0.74
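The last column of Table II can be cross-checked from the loadings themselves, since the variance of a variable accounted for by the retained components is the sum of its squared loadings. The sketch below uses the values transcribed from Table II; the variable names and the idea of assigning each sampler to the component with its highest loading are illustrative choices for the example, not taken from the paper.

```python
import numpy as np

# Rotated loadings (F1, F2, F3) transcribed from Table II, samplers 1 to 9.
loadings = np.array([
    [0.8316, 0.1588, 0.1916],
    [0.8219, 0.2803, 0.3331],
    [0.3362, 0.2182, 0.9028],
    [0.8354, 0.3828, 0.0609],
    [0.8054, 0.3555, 0.1671],
    [0.3422, 0.8257, 0.1964],
    [0.1894, 0.8688, 0.0641],
    [0.3359, 0.8683, 0.2162],
    [0.7403, 0.2493, 0.3382],
])
# Last column of Table II as printed.
reported = np.array([0.76, 0.88, 0.98, 0.85, 0.80, 0.84, 0.80, 0.91, 0.74])

# Variance of each sampler explained by the three components.
communality = (loadings**2).sum(axis=1)
# Average over the nine standardized variables: share of total variance explained.
total_explained = communality.mean()
# Component (1-based) on which each sampler loads most strongly.
dominant = loadings.argmax(axis=1) + 1
```

The recomputed values agree with the printed column to within rounding, their mean is close to the 83.8% quoted in the text, and the dominant components reproduce the grouping of samplers 1, 2, 4, 5 and 9 (first component), 6, 7 and 8 (second) and 3 (third).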

The second component accounts for 12.04% of the overall variance and it has high factor loadings for the monitoring samplers 6, 7 and 8. These sampler points have quite homogeneous pollution levels. Samplers 6 and 7 are located to the NE of the city, relatively close to the industrial area I2 (see Fig. 1). In addition, the University Hospital H3 adjoins both samplers and its continuous oil burning has probably influenced them. Near the monitoring sampler 8 there are no industrial sources. However, there is another large hospital, H2, whose emissions can be regarded as a continuous source. Therefore, this second component marks the areas of the city affected by single continuous sources, when the large hospitals are included in this category.

The third component accounts for 6.4% of the overall variance, and sampler point 3 is the only one that has a high factor loading. This sampler point is located in a typically urban environment and adjacent to the Military Hospital, H1, which is smaller than the two mentioned above. In addition, a second industrial area, I1, is located to the south of the city. However, since this wind direction is not one of those prevailing in Valladolid, the emissions of its sources are not expected to have an important impact on this sampler. The isolated behaviour of this sampler point is probably due to the dual contribution of domestic heating and the emissions from the hospital H1; in summary, it could be influenced by both continuous and discontinuous sources.

SO2 CONCENTRATIONS AND THEIR DEPENDENCE ON METEOROLOGICAL VARIABLES

After classifying the behaviour of the SO2 concentrations from the point of view of their spatial distribution, our second purpose has been to determine whether the variability of the concentrations is significantly related to the meteorological variables. A great many predictors that can exert an influence on the SO2 concentrations are reported in the literature (Benarie, 1980; Garcia, 1982). Previous practice in the statistical analysis of air quality data has emphasized the need to find the most important variables, those which best predict the variability of the concentrations. The selection of the predictors on a physical basis or by statistical screening are two common methods described in the literature (Benarie, 1980). However, both procedures can lead to instabilities in linear multiple regression fits if the predictors are correlated. There is usually a strong interdependence among some of the meteorological variables, which gives rise to some uncertainties: when some of them tend to rise, others tend to rise or to fall at the same time. The application of the PCA approach to the meteorological variables selected a priori as possible predictors of the SO2 concentrations avoids this problem, since it groups the correlated variables into uncorrelated components. This approach has been used here in order to analyse the interdependence of 21 meteorological variables which, on prior physical grounds, were believed to be possible predictors of SO2 concentrations.

After obtaining the PCA solution, we have followed the same procedure as described by Henry (Henry and Hiddy, 1979). It consists of fitting a linear multiple regression to the SO2 concentrations, taking as predictors each one of the retained principal meteorological components. This method has been chosen because our objective is not only to know what kind of variables affect the SO2 concentrations, but also to distinguish, from a meteorological point of view, the behaviour of the two SO2 patterns that we obtained in the previous section. Thus, although this procedure is not the most suitable for forecasting purposes, it can give helpful information about the main characteristics of each one of the areas affected by different kinds of sources.
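The regression-on-components idea can be sketched as follows. This is a minimal illustration on synthetic data, assuming unrotated components and ordinary least squares; the function name `pc_regression`, the latent-factor data and all numerical settings are hypothetical and do not reproduce the paper's calculation.

```python
import numpy as np

def pc_regression(X, y, n_components):
    """Regress y on the leading principal components of the predictors X."""
    # Standardize the correlated predictors.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Diagonalize their correlation matrix and keep the leading components.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    scores = Z @ eigvecs[:, order]             # uncorrelated predictors
    # Ordinary least squares with an intercept.
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    r2 = 1.0 - ((y - fitted)**2).sum() / ((y - y.mean())**2).sum()
    return coef, r2, scores

# Hypothetical data: six correlated "meteorological" variables driven by two
# latent factors, and a response that depends on the first factor only.
rng = np.random.default_rng(1)
f = rng.normal(size=(150, 2))
W = np.array([[1.0, 0.9, 0.8, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 0.9, 0.8]])
X = f @ W + 0.3 * rng.normal(size=(150, 6))
y = -1.5 * f[:, 0] + 0.3 * rng.normal(size=150)
coef, r2, scores = pc_regression(X, y, n_components=2)
```

Because the component scores are uncorrelated, each regression coefficient can be estimated without the instability that arises when strongly interdependent variables are entered together.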

PCA for meteorological variables

The list of the selected meteorological variables is shown in Table III. Daily averages of the meteorological data, concurrent with the SO2 observations, were computed. The only exceptions to this general rule were the atmospheric pressure (PR) and the two atmospheric stability indices (ED and EDI), which were calculated at 12 G.M.T.

ED is defined as TMAX - T500mb, where T500mb is the temperature at the 500-mb level (Rosinsky et al., 1981). This index can be considered an estimate of thermal stability.

EDI is defined as TR - T500mb. According to Castejon (Naya, 1984), this index, called 'isoin', is a suitable indicator of precipitation.

TABLE III

List of meteorological variables used in this study; all the variables correspond to daily averages from 10 h a.m. to 10 h a.m., except PR, ED and EDI, which are referred to 12 h G.M.T.

C        Frequency of calm periods (0.5 m s⁻¹)
SUM1     Frequency of wind directions from 0° to 90°
SUM2     Frequency of wind directions from 90° to 180°
SUM3     Frequency of wind directions from 180° to 270°
SUM4     Frequency of wind directions from 270° to 360°
VVM⁻¹    Inverse of wind speed (m s⁻¹)
H        Average humidity (%)
HMAX     Maximum humidity (%)
HMIN     Minimum humidity (%)
T        Average temperature (°C)
TMAX     Maximum temperature (°C)
TMIN     Minimum temperature (°C)
TD       Dew point (°C)
DUR      Number of daylight hours
B        Average visibility (km)
BMAX     Maximum visibility (km)
BMIN     Minimum visibility (km)
P        Measurable daily precipitation (mm)
PR       Atmospheric pressure at 12 G.M.T. (mb)
ED       Thermal stability index at 12 h G.M.T. (°C)
EDI      Isom stability index at 12 h G.M.T. (°C)


The PCA approach for the 21 meteorological variables led to the results shown in Table IV. We have retained the first 7 components, which accounted for 86.4% of the overall variance. For the sake of simplicity, all the factor loadings under 0.35 have been set to zero in Table IV.

TABLE IV

Rotated factor matrix solution for 21 meteorological variables

Var.        F1     F2     F3     F4     F5     F6     F7     σ²
C         0.00   0.00   0.91   0.00   0.00   0.00   0.00   0.89
SUM1      0.00   0.00   0.00   0.00   0.00   0.00   0.81   0.71
SUM2      0.00   0.00   0.00   0.00   0.95   0.00   0.00   0.92
SUM3      0.00   0.00  -0.69   0.00   0.00   0.00  -0.57   0.87
SUM4      0.00   0.00   0.00   0.00   0.00   0.84   0.00   0.79
VVM⁻¹     0.00   0.00   0.88   0.00   0.00   0.00   0.00   0.86
H        -0.92   0.00   0.00   0.00   0.00   0.00   0.00   0.91
HMAX     -0.66   0.00   0.00   0.00   0.00   0.00   0.00   0.72
HMIN     -0.81   0.00   0.00   0.00   0.00   0.00   0.00   0.79
T         0.00   0.96   0.00   0.00   0.00   0.00   0.00   0.99
TMAX      0.00   0.86   0.00   0.00   0.00   0.00   0.00   0.94
TMIN      0.00   0.88   0.00   0.00   0.00   0.00   0.00   0.93
TD        0.00   0.96   0.00   0.00   0.00   0.00   0.00   0.64
DUR      -0.44  -0.35   0.00   0.00   0.00   0.00   0.00   0.64
B         0.83   0.00   0.36   0.00   0.00   0.00   0.00   0.89
BMAX      0.80   0.00   0.00   0.00   0.00   0.00   0.00   0.79
BMIN      0.61   0.00   0.38   0.00   0.00   0.00   0.00   0.72
P         0.00   0.00   0.00   0.53   0.37   0.00   0.00   0.51
PR        0.00   0.00   0.00  -0.75   0.00   0.00   0.00   0.71
ED        0.43   0.40   0.00   0.63   0.00   0.00   0.00   0.87
EDI       0.00   0.42   0.00   0.82   0.00   0.00   0.00   0.90

The first component explains 29.7% of the overall variance and it has high loadings for the variables related to humidity and visibility. The bipolar signs reveal that when the humidity increases, the visibility decreases and vice versa. We will call this component the 'humidity component'.

The second component explains 18.9% of the overall variance and it presents high positive loadings for all the temperatures. We will call this component the 'temperature component'.

The third component explains 15.5% of the overall variance and presents high positive loadings for the inverse of the wind speed and the frequency of daily calms, and moderate loadings for the third-sector wind directions. The first two variables have the same sign, which means that when the daily wind speed (VVM) tends to rise, the frequency of daily calms (C) tends to fall. The moderate negative loading for SW directions is understandable, since this sector contains the wind directions with the highest wind speeds. Thus, the third component clusters the variables related to the 'mechanical dispersion of pollutants'.


The fourth component explains 5.9% of the overall variance and it has positive loadings for the two atmospheric stability indices and the measurable precipitation. In addition, the atmospheric pressure is included with a negative sign. This component is related to the 'atmospheric stability', since it indicates the expected tendency towards a decrease of the atmospheric pressure when it rains or, what is the same, when the atmospheric stability described by ED or EDI decreases.

The last three components explain only 5.5%, 4.9% and 4.1% of the overall variance and they include the wind directions from the second, fourth and first sectors, respectively. The moderate loadings for the wind directions from the third and first sectors are understandable since they correspond to the prevailing winds in Valladolid. Moreover, their opposite signs are in agreement with their opposing ('anticoincident') directions.

We should point out that the seasonal variable, number of daylight hours DUR, was the only one that remained undefined, since it spreads over the first two components. This variable would probably be aligned with a single component if the period of observations had been longer.

Multiple linear regression to the SO2 concentrations

As was mentioned before, the SO2 concentrations were fitted by a linear multiple regression, using the retained principal components as uncorrelated variables. However, since using too many variables leads to instabilities in multiple linear regression fits, as a first step we eliminated the last three components, related to the wind direction. Moreover, in order to be consistent, the wind direction from the third sector, SW, was also removed. The omission of these variables is based on a previous analysis of the relation between the wind direction and the concentrations. The calculation of the average concentrations in the four sectors into which the compass was divided did not reveal, in general terms, any significant difference for any specific wind direction. Furthermore, to analyse the influence of the synoptic atmospheric stability indices mentioned before, the atmospheric pressure and the precipitation were also removed from the fourth component. However, a new variable was taken into account: the concentration of the previous day, CA. This variable accounts for the mass of pollutant existing in the area at the beginning of the observed day, and it can be considered a suitable index of the atmospheric persistence or, in other words, of the atmosphere's ability to maintain the pollution levels. The importance of this variable is reported in the literature (Bolzern et al., 1982).
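A persistence predictor of this kind is simply the concentration series shifted by one day. The sketch below builds it from a daily record, assuming the natural logarithm is applied to both the response and CA (the paper only indicates that logarithms of the concentrations were computed); the function name and the sample values are hypothetical.

```python
import numpy as np

def previous_day_predictor(conc):
    """Return (y, ca): ln concentration on day t and on day t-1."""
    log_c = np.log(np.asarray(conc, dtype=float))
    # The first day has no predecessor, so the paired series is one day shorter.
    return log_c[1:], log_c[:-1]

# Hypothetical daily SO2 concentrations (µg m⁻³) at one sampler.
daily = [53.0, 60.0, 48.0, 75.0, 90.0, 82.0]
y, ca = previous_day_predictor(daily)
```

Each element of `ca` is then entered as a predictor alongside the meteorological component scores of the same day.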

Thus, the concentrations of SO2 recorded at each sampler point have been fitted using as predictors the concentration of the previous day, CA, and the first four meteorological components with the modifications pointed out above. The F test was used to determine the statistical significance (95%) of each one of the predictors considered.
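One common way to test a single predictor in a multiple regression is the partial F test, which compares the residual sum of squares with and without that predictor. The sketch below assumes this is the form of F test intended, which the paper does not spell out; the function name `partial_f` and the synthetic data are illustrative.

```python
import numpy as np

def partial_f(X, y, j):
    """F statistic for removing predictor column j from a least-squares fit."""
    n = len(y)
    full = np.column_stack([np.ones(n), X])    # intercept plus all predictors
    reduced = np.delete(full, j + 1, axis=1)   # same model without predictor j

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    rss_full, rss_reduced = rss(full), rss(reduced)
    dof = n - full.shape[1]                    # residual degrees of freedom
    # One restriction is tested, so the numerator has 1 degree of freedom.
    return (rss_reduced - rss_full) / (rss_full / dof)

# Hypothetical data: the response depends on the first predictor only.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=100)
f_relevant = partial_f(X, y, 0)
f_irrelevant = partial_f(X, y, 1)
```

A predictor would be marked as significant at the 95% level when its statistic exceeds the corresponding critical value of the F distribution with 1 and `dof` degrees of freedom, and as NS otherwise.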

The partial correlation coefficients obtained at each of the monitoring samplers are shown in Table V. The symbol NS means that the specific predictor was not statistically significant. The last column shows the total variance explained by the predictors at each sampler point.

The first column of Table V shows the results related to the influence of the concentration of the previous day, CA (the natural logarithms, ln, of the concentrations were computed). This variable was statistically significant at all the sampler points. The percentage of the variance explained by this predictor varies from 22.9% to 68.3% for sampler points 5 and 8, respectively. It can be seen that the concentrations of the previous day represent, on average, about 50% of the observed SO2 concentrations. However, it may be noticed that sampler points 7 and 8 have a correlation coefficient which is almost twice the coefficient of the remaining samplers. These results mean that the contribution of the concentration of the previous day is more important in the samplers influenced by continuous sources than in the samplers affected by the emissions of discontinuous sources.

TABLE V

Multiple linear regression coefficients for SO2 concentrations on the 4 meteorological components and the concentration of the previous day, CA, at each sampler point

Sampler   CA      F1         F2       F3            F4           σ²
          (ln)    (-H + B)   (T)      (VVM⁻¹ + C)   (ED + EDI)   (%)
1         0.41    NS         -0.015   -0.21         -0.033       64.60
2         0.38    NS         -0.018   NS            NS           45.60
3         0.42    NS         -0.008   NS            -0.020       43.00
4         0.43    NS         -0.019   NS            NS           45.71
5         0.34    NS         -0.013   NS            -0.019       45.31
6         0.59    NS         -0.007   NS            -0.012       65.11
7         0.75    NS         -0.004   NS            NS           70.43
8         0.78    NS         -0.007   NS            NS           73.12
9         0.55    -0.004     -0.012   NS            NS           56.54

The third column of Table V shows the partial correlation coefficients for the meteorological component related to the temperature, which was also significant at all the sampler points. This variable explains a percentage of the variance which varies from 3.7% to 25.9% for samplers 7 and 4, respectively. The temperature has a decisive influence on pollutants whose main source is domestic heating, because a decrease of the temperature gives rise to an increase in the emissions. This fact justifies the general negative sign we have obtained for all the sampler points. It may however be noticed that samplers 1, 2, 4, 5 and 9 have a coefficient about two times higher than samplers 6, 7 and 8. For example, sampler point 7, which is the nearest to the industrial area I2, has the lowest coefficient, while number 4, located in the downtown area, has the highest one. Again the influence of the temperature emphasizes the difference between the areas affected by continuous and discontinuous sources.

The features of monitoring sampler 3 fall between those of the previous groups of samplers. With regard to the influence of the concentration of the previous day, it behaves in a similar manner to the samplers affected by discontinuous sources, whereas its dependence on temperature gives a coefficient quite similar to that of the group of samplers influenced by continuous sources. This result seems to confirm the dual contribution of the two kinds of sources, as was pointed out before.

The second column, related to the 'humidity component', has only been significant for monitoring sampler 9, and it explains a rather small percentage of the variance, 5.8%. The general conclusion is that neither the humidity nor the visibility influences the concentrations. The isolated behaviour of sampler 9 could be ascribed to the local topographical features, since this point is located on a low hill.

The fourth column, related to the 'mechanical dispersion of pollutants', has only been significant for monitoring sampler 1. This result, although unexpected, has already been mentioned in the literature (Benarie, 1980), and in our case it could be due to the low wind speeds during the period studied. The average value of VVM was only 1.58 m s⁻¹, which reveals an almost total lack of atmospheric ability to disperse pollutants.

The only exception to this general behaviour was found for sampler 1. The negative partial correlation coefficient indicates that the concentrations increase when the wind speed increases or calm conditions decrease. The foregoing suggests that this sampler point receives SO2 from distant sources. In order to assess the influence of the wind directions, their frequencies were also considered as predictors, after removing the variables which were not significant. This study revealed the impact of two wind directions: those from the second and third sectors. Some industrial sources located in the industrial area I and possibly an old alcohol factory (marked as AF in Fig. 1) sited 300 m away in the fourth sector could contribute, with their emissions, to this sampler point. However, these remarks must be accepted with reservations, since the data set was not very large.

The fifth column, related to the 'synoptic atmospheric stability', has been significant for sampler points 1, 3, 5 and 6. The negative partial correlation coefficients reveal the expected increase of the SO2 concentrations when the atmospheric stability decreases or, which is the same, when the atmospheric pressure increases, and vice versa (see Table IV). However, the influence of this component has been small. On the one hand it affects only some samplers, and on the other hand it explains small percentages of the variability of the concentrations, since its contribution was never higher than 10% at any sampler point.


CONCLUSIONS

The application of multivariate principal component analysis has proved a helpful technique for providing a meaningful description of two systems of data.

(1) Three components have described the spatial distribution of the SO2 concentrations recorded at nine sampler points located in an urban area. Each one of them represented areas influenced by continuous sources -- when large hospitals are included -- and discontinuous sources.

(2) Seven uncorrelated components have described the interdependence of the 21 meteorological variables. Each one of them clustered correlated variables related to: humidity, temperature, mechanical ability to disperse pollutants, atmospheric stability and wind directions from the 90° sectors.

The variability of the SO2 concentrations in relation to the meteorological variables has been studied by fitting the concentrations of SO2 recorded at each sampler point to the meteorological components. This method has provided additional information about the behaviour of the two main SO2 areas. The temperature and the concentration of the previous day were the significant variables which explained the main part of the variability of the SO2 concentrations at all the sampler points. After comparing the influence of the concentration of the previous day and the temperature, we can summarize that the sampler points influenced by continuous sources are characterized by a strong persistence of the concentrations and a weak dependence on the temperature; in contrast, the discontinuous pattern produced by domestic heating is strongly dependent on the temperature, while the persistence of the concentrations plays a less important role.

Thus a double application of the PCA approach, to the daily SO2 concentrations at 9 sampler points and to 21 external variables, has been summarized in terms of two different patterns in which the temperature and the concentration of the previous day explain the main part of the variability of this pollutant in Valladolid.

This preliminary study may be useful as a starting point for future forecasting of SO2 concentrations. For example, since the concentrations of the previous day play an important role, autoregressive models are regarded as among the most suitable choices. Moreover, these kinds of models could be applied to each one of the main areas into which Valladolid has been divided: the NE of the city, influenced by single sources, and the remaining area, which is basically affected by domestic heating.
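As a rough indication of what the simplest autoregressive model would look like, the sketch below fits a first-order model x_t = c + phi * x_(t-1) by least squares and produces a one-day-ahead forecast. The model order, the function names and the noise-free sample series are assumptions chosen for illustration; the paper does not specify or fit any particular autoregressive model.

```python
import numpy as np

def fit_ar1(series):
    """Least-squares estimate of c and phi in x_t = c + phi * x_(t-1)."""
    x = np.asarray(series, dtype=float)
    prev, curr = x[:-1], x[1:]
    design = np.column_stack([np.ones(len(prev)), prev])
    (c, phi), *_ = np.linalg.lstsq(design, curr, rcond=None)
    return float(c), float(phi)

def forecast_next(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * float(series[-1])

# Hypothetical noise-free series generated with c = 1 and phi = 0.5.
series = [0.0, 1.0, 1.5, 1.75, 1.875]
c, phi = fit_ar1(series)
next_value = forecast_next(series, c, phi)
```

In line with the two patterns identified above, such a model could be fitted separately for the samplers influenced by continuous sources and for those dominated by domestic heating, with temperature added as an external predictor where it matters most.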

REFERENCES

Adams, F., Van Craem, M. and Van Espen, P., 1975. The elemental composition of atmospheric aerosol particles at Chacaltaya, Bolivia. Atmos. Environ., 14: 879-893.

Benarie, M.M., 1980. Urban Air Pollution Modelling. The Macmillan Press, London.

Bolzern, P., Finci, G., Fronza, G. and Spirito, A., 1982. Armax stochastic models of air pollution: three case studies. Proc. October 1979 IIASA Workshop, Pergamon Press, London, pp. 185-193.

Gatz, D.F., 1978. Identification of aerosol sources in the St. Louis area using factor analysis. J. Appl. Meteorol., 17: 600-618.

Garcia, R., 1982. Modelos estocásticos para la predicción de contaminantes y variables meteorológicas. Tesis Doctoral, Universidad Complutense, Madrid.

Harman, H.H., 1976. Modern Factor Analysis. The University of Chicago Press, Chicago.

Heidam, N.Z., 1982. Atmospheric aerosol, factor models, mass and missing data. Atmos. Environ., 16: 1923-1931.

Henry, R.C. and Hiddy, G.M., 1979. Multivariate analysis of particulate sulphate and other quality variables by principal component, Part I. Annual data from Los Angeles and New York. Atmos. Environ., 13: 1581-1596.

Henry, R.C., Lewis, C.W., Hopke, P.K. and Williamson, H.J., 1984. Review of receptor model fundamentals. Atmos. Environ., 18: 1507-1515.

Hopke, P. and Macias, E., 1981. Atmospheric Aerosol Source/Air Quality Relationships. American Chemical Society. ACS Symp., Series 167.

Naya, A., 1984. Meteorología Práctica. Espasa Calpe, Madrid.

Roscoe, B.A. and Hopke, P.K., 1982. The use of component factor analysis to interpret particulate compositional data sets. J. Air Pollut. Control Assoc., 32: 637-642.

Rosinsky, J., Morgan, G., Weickmann, P. and Lecinski, A., 1981. A study of population of ice-forming nuclei in New Mexico, U.S.A., and its possible dependence on meteorological processes. Meteorol. Runsch., 34 :

Stern, A., 1975. Air Pollution. Academic Press, New York, San Francisco.