AUTOMATED MOTION DETECTION FROM SPACE IN SEA SURVEILLANCE
Elisavet Charalambous(a)(b), Junichi Takaku(c), Pantelis Michalis(d), Ian Dowman(e), Vasiliki
Charalampopoulou(f)
(a) ADITESS Ltd , Nicosia, Cyprus, (b) Department of Electrical and Computing Engineering,
University of Cyprus, Nicosia, Cyprus, (c) Remote Sensing Technology Center of Japan, (d) Center
for Security Studies (KEMEA), (e) Department of Civil, Environmental and Geomatic Engineering,
University College London, (f) Geosystems Hellas
ABSTRACT
The Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM) carried by the Advanced Land-Observing
Satellite (ALOS) was designed to generate worldwide topographic data with its high-resolution and stereoscopic
observation. PRISM performs along-track (AT) triplet stereo observations using independent forward (FWD), nadir
(NDR), and backward (BWD) panchromatic optical line sensors of 2.5m ground resolution in swaths 35 km wide. The
FWD and BWD sensors are arranged at an inclination of ±23.8◦ from NDR.
In this paper, PRISM images are used from a new perspective, in the security domain for sea surveillance, based on the
sequence of the triplet, which is acquired over a time interval of 90 sec (45 sec between images). An automated motion
detection algorithm is developed, allowing the combination of the information captured at each instant and therefore the
identification of patterns and trajectories of moving objects at sea, including the extraction of geometric characteristics
along with the speed and direction of movement. The developed methodology combines well-established image
segmentation and morphological operation techniques for the detection of objects. Each object in the scene is represented
by dimensionless measure properties and maintained in a database to allow the generation of trajectories as these arise
over time, while the location of moving objects is updated based on the result of neighbourhood calculations.
Most importantly, the developed methodology can be deployed in any airborne (optionally piloted) sensor system with
along-track stereo capability, enabling the provision of near-real-time automatic detection of targets; a task that cannot
be achieved with satellite imagery due to its very intermittent coverage.
Keywords: ALOS, PRISM, stereo mapping, satellite images, image processing, automated motion detection, sea
surveillance
1. INTRODUCTION
Earth observation satellites capture an area on Earth periodically according to their orbit. Humans are good at deriving
information from such images because of our innate visual and mental abilities [1]. In image processing, a range of
algorithms has already been developed to extract specific (application-dependent) geospatial information for
the Earth itself, along with the position, size, shape and interrelationships of man-made objects on its surface. Thus, the
satellite image is translated into meaningful geospatial information in various formats (raster, vector and matrix).
In this paper, the ALOS PRISM triplet of along-track stereo panchromatic images is used as the source for the automatic
detection of vessels' motion on the sea. The test areas are two large commercial ports in Cyprus (Larnaca, Limassol). As
a first step, the PRISM images must be geometrically registered to each other before applying the automated motion
detection algorithms. This procedure is important, as the improvement of the relative position increases the accuracy and
the precision of the automatic motion detection algorithm. Since this is our first attempt at performing automatic motion
detection with satellite images, we note that the selected methods were chosen in ways that would limit bias
(even at some cost to the effectiveness of the outcome). The automatic detection of patterns within the images
is done by making use of edge detection techniques along with several pattern descriptors, followed by a
number of dimensionless measures.
The results of this study are very promising: the large majority of moving vessels are detected, from
large ships to very small boats (as long as they are captured in the image). Moreover, the motion of a vessel is identified
even at very low speeds (the lowest could be less than 0.7 miles/hour). Surprisingly, during the
implementation of the procedure an airplane on its way to land at Larnaca airport was detected, too. Thus, it seems that the
developed methodology could be a generic one.
In this paper, the importance of along-track stereoscopic viewing with optical sensors for automatic motion
detection procedures is presented, along with a very effective methodology to extract the position and velocity of moving vessels
on the sea. In general terms, the developed methodology could be used in any satellite or airborne (optionally piloted)
sensor system with along-track stereo capability, which could provide even real-time automatic detection of targets.
Thus, an integrated solution could be established, starting from a Remotely Piloted Aircraft System (RPAS) capable of
carrying an along-track sensor payload, which can run the above-mentioned methodology on board and send
information on the moving vessels in near real time to a control center.
2. ALOS/PRISM
The Advanced Land Observing Satellite (ALOS) was launched in January 2006 and operated very well until May 2011.
Though the mission life of the satellite ended years ago, approximately 6.5 million scenes of archived data covering the
entire globe are still available to users [2]. The Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM), one
of the sensors carried on board ALOS, was designed to generate worldwide topographic data with its optical stereoscopic
observation. The sensor consists of three independent panchromatic radiometers for viewing forward (FWD), nadir
(NDR), and backward (BWD) in 2.5m ground resolution producing a triplet stereoscopic image along its track of 35km
swath. The FWD and BWD sensors are arranged at an inclination of ±23.8◦ from NDR, so that the time interval of the
acquisitions for the triplet images is 90 seconds (45 seconds between images). This specific imaging mechanism
can be utilized not only for creating the topographic data but also for detecting moving objects during the time intervals.
a. Test data set
We selected four datasets of PRISM triplet images covering the two test areas in Cyprus.
Table 1 shows the basic information of the datasets.
Table 1. Basic information of the PRISM triplet image datasets.
Area Scene ID (NDR) Date Scene center lat./lon. (NDR)
Larnaca ALPSMN231682900 2010/05/31 34.804/ 33.657
ALPSMN271942900 2011/03/03 34.806/ 33.637
Limassol ALPSMN254292905 2010/11/02 34.557/ 33.040
ALPSMN267712905 2011/02/02 34.559/ 33.035
b. Registration of the stereo images
The PRISM triplet stereo images should be geometrically registered with each other before applying the automated
motion-detection-algorithms. The relative orientation technique, which is being used in the triangulation for creating the
topographic data, is utilized in this process [3]. In the relative orientation, tie-points among the triplet images of level-1B1
(i.e., not geometrically corrected) are automatically generated first. Approximately 100 tie-points were generated for
each image triplet. Figure 1 shows the distribution of 100 tie-points generated on one of the NDR images, and one of the
tie-points generated on each image.
Figure 1. Distribution of 100 tie-points on one of NDR images (a), and one of the tie-points generated on each image: (b)
NDR, (c) FWD, and (d) BWD.
Next, the tie-points are used to estimate errors of the orientation parameters for the FWD/BWD images, while those for the NDR
image are fixed as a reference. A viewing vector, corresponding to each tie-point in image space, is calculated in
object space for each triplet image by using the sensor model of PRISM and the orbit/attitude data of the satellite. Then
the errors of the orientation parameters (i.e., attitude errors) of the FWD/BWD sensors are determined with the conventional
least-squares method so that the RMSE of the x-parallaxes (i.e., cross-track parallaxes) of the viewing vectors between FWD
and NDR and between BWD and NDR is minimized. The symmetry of the y-parallaxes (i.e., along-track parallaxes)
between FWD and NDR and between BWD and NDR is also considered in the estimation. Table 2 shows the residual
RMSE of the x-parallaxes for the tie-points on each image triplet. The RMSEs range from 0.76 m to 1.50 m and
correspond to the relative registration accuracy of each image triplet. The absolute errors of the relative orientation
models remain almost entirely as a vertical offset, due to instability of the pitch angles of the FWD/BWD sensors; they are corrected
with a tie-point generated on an area where the absolute height is known, such as at sea level (i.e., height = 0).
Table 2: Residual RMSE of x-parallaxes for the tie-points on each image triplet (in meters).
Scene ID (NDR) No. of tie-points FWD – NDR BWD – NDR
ALPSMN231682900 98 1.404 1.005
ALPSMN271942900 100 0.935 1.078
ALPSMN254292905 100 0.756 1.006
ALPSMN267712905 100 1.502 1.063
Finally, the level-1B1 triplet images are projected onto the geoid surface (i.e., mean sea level) in UTM coordinates
with the orientation models, so that they include only y-parallaxes, which correspond to the orthometric height on
the ground. Therefore, the objects on the sea are basically registered geometrically on the images without any post-
processing.
3. BACKGROUND
Motion or traffic detection based on satellite data has already been examined by the research community, using either
high-resolution optical satellite images (WorldView) or SAR images (TerraSAR-X).
In the WorldView case, traffic monitoring or car motion detection is based on specific characteristics of the
WorldView acquisition. WorldView-2's focal plane consists of one PAN and two MS imaging sensors (MS1: Blue,
Green, Red, Near-IR1 and MS2: Coastal Blue, Yellow, Red Edge, Near-IR2), with a very small lag in data recording
between the spectral bands, a few hundred milliseconds, which causes slight displacements of moving objects in
image space.
In [4], the use of the MS1 and MS2 spectral bands for traffic detection, owing to the high spectral contrast
between roads and vehicles at these specific wavelengths, is investigated. In this case, the delay in data recording between
Coastal Blue and Blue has been estimated at only 316 ms [5]; thus, an object with a speed of 30 km/h (18.6 mph) will
show a ground displacement of 2.6 m between the image sensed at time t0 and time t1 = t0 + 316 ms. In terms of image
space, the ground displacement corresponds to about 5 image pixels in the PAN and fewer than 2 image pixels in the MS
channels. The automated motion detection procedure for moving cars in [4] appears to be very effective, with a
reported accuracy better than 90%.
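The displacement arithmetic above can be sketched as a quick check; the 0.5 m PAN and 2.0 m MS ground sample distances used below are assumptions taken from the WorldView-2 mission specifications, not stated in this paper:

```python
def ground_displacement_m(speed_kmh: float, lag_s: float) -> float:
    """Distance (in metres) an object travels during the inter-band lag."""
    return speed_kmh / 3.6 * lag_s  # km/h -> m/s, then multiply by seconds

# 30 km/h over the estimated 316 ms Coastal Blue / Blue lag -> ~2.6 m
d = ground_displacement_m(30.0, 0.316)

# Expressed in pixels for the assumed WorldView-2 ground sample distances
pan_pixels = d / 0.5   # PAN at 0.5 m/pixel -> about 5 pixels
ms_pixels = d / 2.0    # MS at 2.0 m/pixel -> under 2 pixels
```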
However, for sea surveillance the WorldView image acquisition process is of little help, since the time lag is too
short to allow detection of slow-moving vessels. Moreover, the ALOS PRISM images are all in the
same waveband, while the WorldView images are in different spectral bands, so additional pre-processing would be needed.
In the TERRASAR case, the Dual Receive Antenna (DRA) or Aperture Switching (AS) mode [6] enables the reception
of two SAR images of the same scene within a small temporal gap, which can be utilized for interferometric detection
approaches. Since moving objects suffer from special effects in the SAR processing algorithm, specific methods to detect
vehicles are required [7]. Thus, in order to detect moving vehicles, the effects in the SAR images caused by the vehicle's
motion in the across- and along-track directions must be derived simultaneously.
However, for sea surveillance, as in the WorldView case, this acquisition mode is not useful, since the temporal gap
is too short to allow detection of slow-moving vessels. In conclusion, the main tool that can provide the data for
automated motion detection is a sensor system based on the same architecture as ALOS PRISM. This means that
along-track image acquisition with at least one sensor in oblique view is the essential requirement.
4. METHODOLOGY
Satellite images are essentially raster information where the cell (pixel) size determines the resolution at which the data
is represented. The depiction of pictorial information with resolution dependent media introduces the benefit that the
location of each point in the image is straightforwardly characterized by its coordinates in the grid of pixels, but also the
issue that the resolution imposes limits on the level of approximation (during digitization) of every element. Satellite
images commonly capture the objects of interest from fairly large distances and, as a consequence, large areas fall within
the field of view; artifacts in the resulting image may not be accurately represented. Moreover, the study of consecutive
images – obtained at regular time intervals – makes this task even more challenging, as it involves the extraction and
identification of under-sampled artifacts. The automatic detection of patterns within the images is done by making use of
well-established edge detection techniques [8] while pattern identification is accomplished with the use of several pattern
descriptors [9]; among them a number of dimensionless measures [10]. The combination and further analysis of the
results obtained by the individual snapshots then allows the automatic motion detection of the overlapping scenes [11].
Each pixel of ALOS PRISM translates into an area of 6.25 m²; each pixel side corresponds to a distance of 2.5 m.
According to the Nyquist criterion, a sampling frequency equal to twice the highest spatial frequency of the specimen is required
in order to accurately preserve its spatial resolution in the resulting digital image [8]. An equivalent statement is Shannon's
sampling theorem, which states that the digitizing device must use a sampling interval no greater than one-half
the size of the smallest resolvable feature of the optical image. Therefore, to capture the smallest degree of detail present
in a specimen, the sampling frequency must be sufficient that two samples are collected for each feature, guaranteeing
that both the light and dark portions of the spatial period are gathered by the imaging device [10]. With regards to the
analyzed sample, a pixel size of 2.5 m may only adequately depict artifacts larger than 5 m. Even though the
automatic motion detection of ships does not require their image representation at the finest level, it is evident that this
limitation may easily cause problems, especially for small ships, ships close to each other, or ships close to the
coastline. Due to this, the image processing techniques applied for pattern recognition were selected with this
limitation in mind, while still considering computational performance.
Image segmentation was accomplished by applying global thresholding to each image; this serves as a computationally
efficient method even for very large images. Thresholding is a very well-known method in the field, which partitions the
image histogram by using a single global threshold. Segmentation is then accomplished by scanning the image pixel by
pixel and labeling each pixel as object or background [8]. Over the years this method has evolved, producing methods for
optimal and adaptive thresholding; despite this, the most basic implementation was adopted, as it limits the bias over
any triplet of images. The resultant binary image then became subject to binary dilation with a diamond structuring
element of 5 neighbours, which allowed filling one-pixel gaps between extracted edges. In binary morphology, dilation is
a shift-invariant (translation-invariant) operator, strongly related to the Minkowski addition [12]. The objective at this
point of the process was to extract the patterns drawn by the objects of interest and any other artifacts that remained in the
sea. The extraction of patterns became possible by filling the connected edges and then extracting their boundaries. This
important step resulted in the outlining of the ships.
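This segmentation step can be sketched minimally as follows, assuming NumPy/SciPy; the threshold value and the toy image below are illustrative, not those used in the paper:

```python
import numpy as np
from scipy import ndimage

def segment_objects(img: np.ndarray, threshold: float) -> np.ndarray:
    """Global thresholding followed by a diamond dilation, as described
    in the text. `img` is a 2-D grayscale array."""
    binary = img > threshold                      # label pixels as object/background
    diamond = np.array([[0, 1, 0],
                        [1, 1, 1],
                        [0, 1, 0]], dtype=bool)   # 5-neighbour diamond element
    # Dilation closes one-pixel gaps between extracted edge fragments
    return ndimage.binary_dilation(binary, structure=diamond)

# Toy example: two bright edge fragments separated by a one-pixel gap
img = np.zeros((5, 7))
img[2, 1:3] = 1.0
img[2, 4:6] = 1.0
mask = segment_objects(img, 0.5)   # the gap at (2, 3) is filled by dilation
```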
The aim of this paper was to test the feasibility of automatic motion detection of ships from satellite images; therefore,
the coastline was considered as given and was manually removed after boundary detection. The reduced versions of the
triplet images, consisting only of patterns in the sea, were then used for the detection of pattern movements. The reliable
and robust tracking of a ship over discontinuous snapshots requires the use of pattern descriptors for identification,
beyond mere proximity, especially considering peak times near ports. Therefore, for each detected pattern, in each of the
three images, the following measures/features were calculated: area, perimeter, roundness, elongation, bounding box
area, eccentricity, centroid, orientation and solidity (shown in Table 3), where the centroid was expressed in the form of
standard coordinates by combining the geo-information attached to each image with the pictorial coordinates.
Table 3. List of calculated pattern descriptors
Metric Description
Area Actual number of pixels in the region
Perimeter Length in pixels around the boundary of the region; calculated as the distance between each
adjoining pair of pixels around the border of the region
Roundness A ratio function to the area and perimeter of the region (dimensionless ratio)
Elongation Minor axis width to major axis length ratio (dimensionless ratio)
Bounding box area The area of the box completely containing the object, calculated as the product of the lengths of the
major and minor axes (in pixels)
Eccentricity The ratio of the distance between the foci of the ellipse and its major axis length (dimensionless
ratio)
Centroid Specifies the center of mass of the region
Orientation Calculated as the angle between the line segment connecting two points and the equator (x-axis)
Solidity Proportion of the pixels in the convex hull that are also in the region. Computed as area/convex area
(dimensionless ratio)
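A few of the descriptors in Table 3 can be computed with a short NumPy sketch on a binary region mask; as a simplification, the elongation here uses the bounding-box sides rather than the fitted-ellipse axes implied by the paper:

```python
import numpy as np

def basic_descriptors(mask: np.ndarray) -> dict:
    """Simplified versions of a few Table 3 descriptors for one binary region."""
    ys, xs = np.nonzero(mask)
    area = ys.size                           # number of pixels in the region
    centroid = (ys.mean(), xs.mean())        # center of mass (row, col)
    h = ys.max() - ys.min() + 1              # bounding-box height
    w = xs.max() - xs.min() + 1              # bounding-box width
    elongation = min(h, w) / max(h, w)       # minor-to-major ratio (dimensionless)
    return {"area": area, "centroid": centroid,
            "bbox_area": h * w, "elongation": elongation}

# Toy 2x4 rectangular region, e.g. an elongated vessel trace
mask = np.zeros((6, 8), dtype=bool)
mask[2:4, 1:5] = True
d = basic_descriptors(mask)
```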
Image differencing is commonly used for the detection of movement [11] between images of the same size. The triplet
images under analysis are captured at times separated by intervals of 45 seconds, from three different angles (backward,
nadir and forward), and at non-standard resolution. Despite the high overlap of the registered images, image
differencing was not an option due to the limitations already discussed. The detection of motion for each pattern
was instead performed by local searching: for two successive images, proximity searches were initiated on the
second image around the centroid points of the patterns in the first. The radius of the search was determined as a
function of the elapsed time between the images, a ship's speed limit when travelling near the coast, and the pixel
resolution, while the distance between two points was calculated with the Haversine distance metric [13]. A match was
signified if the proximity search yielded a pattern with pattern descriptors (roundness, elongation, eccentricity and
solidity; area and perimeter were used in the case of multiple matches) similar to those of the initiating pattern. The
centroid points (expressed as coordinates) of the matching tuples are used for calculating the speed of movement, in
nautical miles per hour. The orientation of movement is calculated from the angle formed between the equator (x-axis)
and the line through the centroids of the tracked ship in two consecutive images. This avoids miscalculating the
orientation of square shapes and also provides information on the direction of movement.
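The distance, speed, and direction calculations can be sketched as follows (Haversine per [13]); the Earth radius and the sample coordinates are illustrative assumptions, and the heading uses a simple planar approximation:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, r=6371000.0):
    """Great-circle distance in metres between two lat/lon points (Haversine)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def speed_knots(dist_m: float, dt_s: float) -> float:
    """Speed in nautical miles per hour from displacement and elapsed time."""
    return dist_m / 1852.0 / (dt_s / 3600.0)

def heading_deg(lat1, lon1, lat2, lon2):
    """Angle of the centroid-to-centroid segment w.r.t. the equator (x-axis),
    as a planar approximation of the orientation of movement."""
    return math.degrees(math.atan2(lat2 - lat1, lon2 - lon1))

# Centroids of a matched pattern in two images 45 s apart (illustrative values)
dist = haversine_m(34.8040, 33.6570, 34.8060, 33.6570)
v = speed_knots(dist, 45.0)   # roughly 9.6 knots for this displacement
```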
The discussed process iterated twice for each triplet of images: during the first iteration the backward and nadir images
were analyzed, while during the second iteration the accordance between the nadir and forward images was examined.
Each set of results contained the list of matched patterns, and the final step involved tracking the presence of each
identified pattern in both sets. It is also important to note that each pattern is used only once (during the matching
process); in case more than one pattern matches the same characteristics, the one closest to the initial state is selected.
5. RESULTS AND DISCUSSION
The effectiveness of the discussed methodology has been tested on the triplets of PRISM images, with the proximity
search being limited to radii of 230 m (which corresponds to a speed of 10 nautical miles per hour) and 460 m
(a speed of 20 nautical miles per hour). Figure 2 shows part of a satellite image captured at nadir, overlaid with some
detected and tracked ships. The patterns drawn in red are extracted from the BWD image, the blue from the NDR and the
green from the FWD sensor. Moving ships are captured in different snapshots which, when combined, reveal their track;
along the same lines, the traces of stationary ships fall on top of each other. Figure 2 also illustrates the problem of extracting
patterns very close to the dock or ships at anchor; however, with the deployment of the already discussed methodology this
is solved, as the limits of the coastline are given.
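The chosen search radii follow directly from the assumed speed limit and the 45 s interval between acquisitions (1 nautical mile = 1852 m):

```python
def search_radius_m(max_speed_knots: float, dt_s: float) -> float:
    """Maximum distance a vessel can cover at a given speed limit during
    the interval between two acquisitions."""
    return max_speed_knots * 1852.0 / 3600.0 * dt_s  # knots -> m/s, times seconds

r10 = search_radius_m(10, 45)   # ~231.5 m, matching the 230 m radius in the text
r20 = search_radius_m(20, 45)   # ~463 m, matching the 460 m radius in the text
```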
Figure 2. Example shot where both still and moving ships are successfully tracked
Figure 3 shows instances in which the patterns were correctly matched; the ships are correctly matched regardless of their
difference in speed of movement. Pattern representation with the use of descriptors, and the expression of centroids in
the coordinate system, allow the superposition of the information contained within the images. The impact of the limitations
imposed by satellite images (i.e., the fuzzy/blurry depiction of vessels) may be significantly decreased when the analysis
expands beyond single triplets and previously gathered information is used to increase the reliability of
already produced results. Also, the detection of vessels near non-overlapping regions is flawed, as shown in Figure 4, since
not all three sensors achieve complete coverage of the area.
More interestingly, it has been noticed that the proposed methodology is also valid for the detection of any type of
moving object (large enough) that resides within the captured field of view. Upon the analysis of triplets of the
Larnaca port it was noticed that the patterns drawn by flying aircraft could also be detected, as long as the proximity
search was performed with a radius large enough to capture an aircraft's speed (Figure 5).
Figure 3. Example patterns of successful motion detection and tracking
Figure 4. The incomplete overlap of the images may prevent the tracking of objects that lie outside the fully covered areas
Motion detection from space is possible when images are collected on a regular basis. The implemented experiment
illustrated that the speed and orientation of ships may be calculated adequately, while the trajectory of movement may be
obtained by combining the calculated orientation at each snapshot. Taking it a step further, this could be enhanced with
an estimated arrival time, calculated as the length of the trajectory line to the coast over the average speed of movement.
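Such an estimated-arrival-time calculation could look like the following sketch; the distance and speed values are purely illustrative:

```python
def eta_hours(trajectory_to_coast_m: float, avg_speed_knots: float) -> float:
    """Estimated time to reach the coast: trajectory length over average speed
    (1 nautical mile = 1852 m)."""
    return trajectory_to_coast_m / (avg_speed_knots * 1852.0)

# A vessel 9260 m from the coast moving at an average of 10 knots
eta = eta_hours(9260.0, 10.0)   # 0.5 hours
```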
The obtained results could be improved by employing a more sophisticated method for edge detection. Even though the
use of thresholding significantly reduces the computational complexity and processing time, the result
is highly dependent on the selection of the threshold value (and the overall contrast).
Figure 5. Aircraft automatic detection during landing procedure
Undetected ships result from instances where the ship either lies outside the overlapping region or, due to under-
sampling, its trace changes significantly from one capture to another. Also, most vessels share common shape
properties (i.e., vessels commonly appear as elongated patterns) and operational properties (i.e., limits on acceleration
and deceleration). Such measures may be used as a means of increasing robustness in the presence of noise, which may
be caused by bad weather conditions. The numbers of successfully and unsuccessfully detected artefacts for each triplet
are summarized in Table 4. The presented numbers are the result of manually counting and examining the resultant
images; however, it is important to stress that most misses occurred in the detection of small vessels (with areas of less
than 10 px), which the eye could not identify with certainty. The figures in Table 4 also indicate that the success rate
varied from triplet to triplet. This is mainly due to the fact that the methodology did not consider the differences in the
contrast of the images (so that no additional bias was introduced), the fact that the counting and identification of the
vessels during evaluation was done by eye (no ground truth existed), and finally the fact that the images do not
completely overlap.
Table 4. The number of detected and missed vessels/artefacts in the sea
Scene ID (NDR) Detected Missed Error (%)
ALPSMN231682900 28 4 12.5
ALPSMN271942900 33 5 13.1
ALPSMN254292905 47 4 7.8
ALPSMN267712905 35 3 7.8
6. CONCLUSIONS
The automatic detection of vessels from satellite images is possible despite the challenges imposed; however, it requires
consideration not only of the object's properties (speed of movement, ability to change orientation) but also of the issues
introduced by the large pixel size. Additionally, the accurate and reliable detection of a vessel's track requires the use of
information from multiple, sequential triplets. The more times a pattern is identified, the higher the chance of generating
reliable and plausible conclusions on the trajectory and route of a ship, while the use of several dimensionless metrics
serves the objective of pattern identification relatively well when the depiction of an object is not accurate.
As already mentioned in this paper, an along-track stereoscopic view, combined with an effective automatic motion
detection methodology, can provide the position and velocity of moving vessels. More generally, since ALOS PRISM,
along with any other satellite system, cannot provide near-real-time data, an airborne platform is needed for real-time
surveillance. Thus, a Remotely Piloted Aircraft System (RPAS) could be an option for real-time information: one
capable of carrying an along-track sensor payload, with onboard processing implementing the above methodology and
an appropriate communication channel to send information on the moving vessels (targets) in near real time to a
control center in an appropriate format.
ACKNOWLEDGEMENTS
The authors would like to thank Dr. Takeo Tadono of the Earth Observation Research Center (EORC) at the Japan
Aerospace Exploration Agency (JAXA) for his contribution related to the provision of ALOS PRISM datasets.
REFERENCES
[1] Y. Shubham, T. Rupali, J. Ankush, and B. Roshan, “Crack Detection of Medical Bone Image Using Contrast
Stretching Algorithm with the Help of Edge Detection,” Int. J. Sci. Eng. Technol., vol. 4, no. 3, pp. 223–227,
2015.
[2] T. Tadono, H. Ishida, F. Oda, S. Naito, K. Minakawa, and H. Iwamoto, “Precise Global DEM Generation by
ALOS PRISM,” Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II–4.
2014.
[3] J. Takaku and T. Tadono, “PRISM On-orbit Geometric Calibration and DSM Performance,” IEEE Trans.
Geosci. Remote Sens., vol. 47, no. 12, pp. 4060–4073, 2009.
[4] R. K. Mishra and Y. Zhang, “Moving Vehicle Extraction from One-Pass WorldView-2 Satellite Imagery,” in
Proceedings of Global Geospatial Conference 2012, 2012.
[5] A. Kääb, “Vehicle velocity from WorldView-2 satellite imagery,” IEEE Data Fusion Contest 2011, 2011.
[6] H. Runge, C. Laux, R. Metzig, and U. Steinbrecher, “Performance analysis of virtual multi-channel ts-x sar
modes,” in Proceedings of EUSAR European Conference on Synthetic Aperture Radar 2006, 2006.
[7] D. Weihing, S. Suchandt, S. Hinz, H. Runge, and R. Bamler, “Traffic Parameter Estimation Using TerraSAR-X
Data,” in Proceedings of the International Society for Photogrammetry and Remote Sensing (ISPRS) Congress,
2008.
[8] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd ed. Prentice Hall, 2008.
[9] T. Caelli, Structural, Syntactic, and Statistical Pattern Recognition: Joint IAPR International Workshops SSPR
2002 and SPR 2002, Windsor, Ontario, Canada, August 6-9, 2002. Proceedings. Springer.
[10] J. Friel, Practical Guide to Image Analysis. ASM International, 2000.
[11] A. Mitiche and J. Aggarwal, Computer Vision Analysis of Image Motion by Variational Methods. 2014.
[12] P. K. Ghosh, “An algebra of polygons through the notion of negative shapes,” CVGIP Image Underst., vol. 54,
no. 1, pp. 119–144, Jul. 1991.
[13] R. W. Sinnott, “Virtues of the Haversine,” Sky and Telescope, vol. 68, no. 2, p. 159, 1984.