International Journal of Environmental Research
University of Tehran
ISSN: 1735-6865 EISSN: 2008-2304
Vol. 3, Num. 4, 2009, pp. 493-502

Urban Expansion Simulation Using Geospatial Information System and Artificial Neural Networks

Pijanowski, B. C.¹, Tayyebi, A.²*, Delavar, M. R.² and Yazdanpanah, M. J.³

¹Department of Forestry & Natural Resources, Purdue University, West Lafayette, USA
²Department of Surveying & Geomatics Eng., College of Eng., University of Tehran, Iran
³School of Electrical and Computer Eng., College of Eng., University of Tehran, Iran
*Corresponding author E-mail: amin.tayyebi@gmail.com

Received 7 Nov. 2008; Revised 16 Aug. 2009; Accepted 25 Aug. 2009

Code Number: er09054

ABSTRACT

The Urban Expansion Model (UEM) was adapted to simulate urbanization by integrating Geospatial Information Systems (GIS), Artificial Neural Networks (ANNs) and Remote Sensing (RS). Two satellite images separated by a specific time interval, together with socio-economic and environmental variables, were employed to simulate urban expansion. The socio-economic and environmental variables were used as inputs, while construction and non-construction areas were used as outputs to train the neural network. The proposed model was calibrated with the Area under the ROC Curve (AROC) and the Kappa Statistic (KS), which are non-shape performance metrics. A real-life case study of the Tehran Metropolitan Area (TMA) is presented to demonstrate the process. This paper presents a version of the UEM that is parameterized for TMA and explores how factors such as road, building area, service centre, green space, elevation, aspect and slope can influence urbanization. Given the urban expansion model for a specific time interval, and assuming that the same rate of urbanization persists, new construction areas of the region can be predicted. The overall accuracy of the model in predicting new construction areas was 80% and 78% with AROC and KS, respectively.

Key words: Urban Expansion Model, Geospatial Information System, Artificial Neural Network, Remote Sensing, Satellite Imagery, Classification

INTRODUCTION

Rapid urbanization has caused not only social problems but also environmental problems in most mega-cities. The complexity of urban systems necessitates the consideration of interdependency among various factors in urban expansion modelling. Urban expansion models attempt to project future changes in urban areas based on past trends and the drivers thought to determine conversions of land between different categories. The expansion of urban areas is determined by the interaction of three broad types of phenomena: the physical constraints of geography and environment, the demand for land by the households and firms who inhabit the city, and the policy constraints that govern land use and spatial interactions in the city. As sustainable development becomes a goal for many urban communities, urban expansion has drawn more public attention. Urban expansion modelling can provide local land-use planners and regional resource managers with information about the potential effects of urban expansion on the environment. A severe problem for modelling urban expansion has been the lack of spatially explicit data. RS and GIS have provided efficient tools for monitoring changes in urban areas during the past three decades. With time series of satellite data, long-term urbanization can be modelled.

Li and Yeh (2002) presented a method for integrating ANN, GIS and Cellular Automata (CA) to simulate different development patterns based on the planning objective. They refined this model to deal with multiple regional land uses and simulations of alternative development scenarios (Yeh and Li, 2003); however, their investigations did not scale down to the intra-urban level. Urbanization is a rapid land-use change process that produces different patterns depending on the proximity to large cities across the landscape (Wu, 2004). The Land Transformation Model (LTM) is a land-use change model that uses ANNs and GIS (Pijanowski et al., 2000, 2002, 2005). Urban expansion in Iran has been monitored with multi-temporal GIS maps (Tayyebi et al., 2008a, b). The Kappa statistic is used for accuracy assessment (Congalton et al., 1983; Monserud and Leemans, 1992; Congalton and Green, 1999; Smits et al., 1999; Wilkinson, 2005). A cross-tabulation matrix is sometimes called a transition matrix when it compares two maps from different times (Pontius et al., 2004b). All values on the diagonal indicate agreement between the two maps and all values off the diagonal indicate disagreement. Cross-tabulation matrices are used regularly to measure the spatial accuracy of raster maps (Congalton and Green, 1999) and more generally to quantify the association between two categorical maps for a variety of reasons (Pontius et al., 2004a, b; Pontius and Spencer, 2005). Pijanowski et al. (2006) used eight calibration metrics to estimate model goodness of fit: four location-based measures and four patch metrics based on patch size, shape and configuration.

This paper presents an adaptation of the UEM that uses transportation factors and the proximity or density of important landscape features as inputs, and construction and non-construction areas as outputs, to model urban expansion over large regions. Two LANDSAT images of TMA from 1988 and 2000 were rectified and registered to Universal Transverse Mercator (UTM) WGS 1984 zone 39N. Supervised classification was used to classify the images into different land-use categories. In addition to the four and two land-use classes that were extracted from the first and second satellite images, respectively, three other classes were added from topographic data. The database then included these nine classes: road, building area, service centre, green space, elevation, slope, aspect, and construction and non-construction areas. The proposed model was calibrated with two non-shape performance metrics: AROC and KS. Given the urban expansion between 1988 and 2000, and assuming that the same rate of urban change persists, new construction areas of TMA were derived for 2012. The overall accuracy of the model in predicting new construction areas was 80% and 78% with AROC and KS, respectively. The Materials & Methods section describes the basic principles of ANNs as applied to the UEM. The Results & Discussion section presents the UEM parameterized for TMA, using a set of spatial interaction rules that are derived from GIS routines.

MATERIALS & METHODS

A Multi-Layer Perceptron (MLP) is designed to approximate an unknown input-output relation by determining the weight of each connection via learning rules. The workflow of the neural network has four stages: the network is designed and its inputs are identified from the data; the network is trained using a subset of the inputs; the network is tested using the remaining part of the data set; and the information extracted from the network is used to predict new construction areas. The net contains an input layer with multiple units, a hidden layer with multiple units, and an output layer with only one unit. Fig. 1 shows a typical feed-forward back-propagation neural network (Pijanowski et al., 2002).

The binary sigmoid function, one of the most typical activation functions, is used in this study. A coefficient can be introduced to the activation function; this is called a bias. A training set with a statistically significant number of pixels for each category was generated. The neural network was then trained by feeding it pairs of vectors: the input vector contained the measurements for a pixel, and the output vector contained the corresponding known class (construction or non-construction area). The procedure was then tested on a set of pixels not belonging to the training set. The output unit receives a signal from each hidden unit, sums the signals with their corresponding weights and computes the output. The weights can be determined using the robust Back-Propagation (BP) algorithm. The algorithm randomly chooses the initial weights and compares the calculated output for a given observation with the expected output for that observation. The difference between the expected and calculated output values across all observations is summarized by the Mean Squared Error (MSE). After all observations are submitted to the network, the weights are modified according to a generalized delta rule to distribute the total error among the various units in the network. The UEM follows eight sequential steps: (1) rectification and registration; (2) classification; (3) integrating topographic data in the database; (4) coding of data to create spatial layers of predictor variables; (5) applying spatial or non-spatial functions in ArcGIS; (6) integrating all input grids; (7) calibration of the UEM; and (8) temporal prediction. The GIS portion of the UEM is encoded in ArcGIS 9.2 Avenue scripting language. A collection of routines written in Java is used to process and analyze data.
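The training procedure described in this paragraph can be illustrated with a minimal sketch. It is not the implementation used in the study (which relied on the Matlab Neural Network Toolbox); the function names, the learning rate, the number of cycles and the synthetic training data are all assumptions made for illustration. The sketch uses a binary sigmoid in both layers, random initial weights, the MSE as the error summary, and a batch form of the generalized delta rule.

```python
import numpy as np

def sigmoid(z):
    """Binary sigmoid activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, y, n_hidden, lr=0.5, cycles=2000, seed=0):
    """Batch back-propagation for an MLP with one hidden layer and one output unit.

    lr, cycles and seed are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))  # random initial weights
    b1 = np.zeros(n_hidden)                            # hidden-layer biases
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)                                   # output bias
    mse_history = []
    for _ in range(cycles):
        # Feed-forward pass over all observations.
        hidden = sigmoid(X @ W1 + b1)
        output = sigmoid(hidden @ W2 + b2)
        error = output - y
        mse_history.append(float(np.mean(error ** 2)))
        # Generalized delta rule: push the output error back through the
        # sigmoid derivatives (the constant 2 of the MSE gradient is folded into lr).
        delta_out = error * output * (1.0 - output)
        delta_hid = (delta_out @ W2.T) * hidden * (1.0 - hidden)
        W2 -= lr * hidden.T @ delta_out / len(X)
        b2 -= lr * delta_out.mean(axis=0)
        W1 -= lr * X.T @ delta_hid / len(X)
        b1 -= lr * delta_hid.mean(axis=0)
    return (W1, b1, W2, b2), mse_history

# Synthetic stand-in for the training set: 200 cells, 7 predictor values each.
rng = np.random.default_rng(1)
X = rng.random((200, 7))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float).reshape(-1, 1)
params, mse_history = train_mlp(X, y, n_hidden=7)
print(f"MSE: first cycle {mse_history[0]:.3f}, last cycle {mse_history[-1]:.3f}")
```

Storing the MSE of every cycle makes it possible to inspect how quickly training stabilizes.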

In rectification and registration, the images were geometrically rectified and registered. At this step, suitable bands were combined to prepare both satellite images for better classification. In classification, the combined images were classified into the classes of interest: the first satellite image was classified into land-use categories, and the second into construction and non-construction areas. These classes were converted to separate spatial layers that can be used to train and test the neural network. In integrating topographic data in the database, the other necessary classes of topographic data were added to the database. In coding of data to create spatial layers of predictor variables, inputs are generated from a series of base layers that are stored and managed within a GIS. These base layers represent land uses or features in the landscape. Grid cells are coded to represent variables as either binary or continuous variables depending on the type of attribute. Applying spatial or non-spatial functions in ArcGIS relates input variables to land-use transitions for each location in an area. We use three classes of functions in ArcGIS: neighbourhood or density functions; site-specific characteristic functions; and functions of distance from the location of a predictor cell. In integrating all input grids, the ANN technique was used to integrate all input grids. The output from this step is a map of likelihood values, which specifies the relative likelihood of change for each cell based on the predictor variable values. In calibration of the UEM, AROC and KS were used to estimate model goodness of fit. In temporal prediction, after the net was trained and calibrated and the biases and weights were obtained, the feed-forward algorithm was applied to the net with a new topology. In this topology, the spatial layers extracted from the second satellite image replaced the spatial layers from the first satellite image as input to the net. The feed-forward algorithm was then used for prediction. The output of this net represents the future new construction areas for the region.
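As an illustration of the distance class of spatial functions, the following sketch computes, for every grid cell, the straight-line distance to the nearest cell of a feature layer such as roads. It is an assumption-laden stand-in for the ArcGIS routines, which are not reproduced in the paper: the function name, the brute-force approach and the toy 4 by 5 grid are hypothetical, and the cell size is a free parameter.

```python
import numpy as np

def distance_to_nearest(feature_mask, cell_size=1.0):
    """Euclidean distance from every cell to the nearest feature cell.

    feature_mask: 2-D boolean array, True where the feature (e.g. road) is present.
    cell_size:    ground size of one cell, in the units wanted for the output.
    Brute force: adequate for small grids only.
    """
    rows, cols = np.nonzero(feature_mask)
    if rows.size == 0:
        raise ValueError("feature_mask contains no feature cells")
    rr, cc = np.indices(feature_mask.shape)
    # Squared distance from each cell to each feature cell, then the minimum.
    d2 = (rr[..., None] - rows) ** 2 + (cc[..., None] - cols) ** 2
    return np.sqrt(d2.min(axis=-1)) * cell_size

# Toy layer: a horizontal road along the second row of a 4 x 5 grid.
road = np.zeros((4, 5), dtype=bool)
road[1, :] = True
dist = distance_to_nearest(road, cell_size=28.5)
print(dist)
```

For rasters of realistic size, a dedicated distance-transform routine would be preferable to this pairwise computation, whose memory use grows with the product of grid cells and feature cells.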

RESULTS & DISCUSSION

TMA is located in the north of Iran and has exhibited accelerated rates of urban expansion over the last three decades as a result of a high population growth rate and increased rural-urban migration. Being the capital of Iran, TMA has undergone a great deal of economic development in terms of urban change and the rapid growth of infrastructure. TMA, with a daytime population of some 10 million and a metropolitan area of over 2000 square kilometres, is the centre of the national government and of commercial, financial, cultural and educational activities in Iran.

The National Cartographic Centre (NCC) database was used as the main source of data for TMA. Two LANDSAT TM images of TMA with 28.5 m resolution were acquired in 1988 and 2000. The LANDSAT images were corrected for atmospheric attenuation and geo-referenced to common base maps. In total, 72 Ground Control Points (GCPs) were selected on both images and used for geometric correction. The NCC database at 1:25,000 scale and 10 m resolution was used as the source of topographic data. NCC topographic data were integrated with our database to provide the appropriate inputs to the GIS-based model. Locations of service centres were obtained from published county road maps and stored as a point coverage. Data on land use, transportation, natural features, public lands, digital elevation and political boundaries were incorporated into the Arc/Info 9.2 software.

The first step in assessing the variables is to determine the factors affecting urban change on the basis of an analysis of existing studies. Input layers represent phenomena that may influence the model. Previous work (Tayyebi et al., 2008a, b) found that the following seven independent variables influence urban expansion in TMA: slope, elevation, aspect, building area, service centre, green space and road.

Both images were geometrically rectified and registered to the Universal Transverse Mercator (UTM) WGS 1984 zone 39N. Registration errors were about 0.50 pixels. In addition, RGB band combinations of the LANDSAT images were produced to prepare the satellite images for better classification. Fig. 2 shows the final results for the 1988 and 2000 LANDSAT images of TMA.

To reduce processing and save time during classification, TMA was extracted from the two LANDSAT images. The 1988 and 2000 LANDSAT images were classified into zones with ENVI software. Supervised classification was used to classify the images into different land-use categories. All land-use classes of TMA were also reclassified from their original classification to Anderson Level I (Anderson et al., 1976) for the modelling exercises. To classify both images, four classes of interest were selected from the different classes in the images, namely road, building area, service centre and green space. The overall accuracy of the classification was 86.12% for the 1988 LANDSAT TM image and 87.64% for the 2000 LANDSAT TM image. Fig. 3 shows the image classification results for TMA.

Absorbing excursion spaces comprise the distances from service centres, green space and building areas. Another important factor is the calculated distance of each cell from the nearest road cell. Landscape topography is an influential factor contributing toward building area utilization. Elevation should be important in this landscape, which is prone to flooding. Slope and aspect are important to developers who want to minimize landscaping costs. The values of the driving variable grids represent the potential accessibility of a location for urban development. Thus, each cell has a 7 by 1 vector of measurements as input. Fig. 4 shows the seven variables, which were compiled in Arc/Info Grid format as inputs for 1988.

There are two constraints on the prediction of new construction areas. First, cells that were construction areas in 1988 are obviously not candidates for new construction areas in 2000. Second, cells that are legally protected from new construction are assigned the absolute lowest suitability value in the final suitability maps. The second LANDSAT image was classified into two zones, construction and non-construction areas, with ENVI software. The output vector was coded to represent cells as a binary variable with values of 0 (non-construction) or 1 (construction). Fig. 5 shows this output variable, which was compiled in Arc/Info Grid format for 2000.
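The two constraints can be expressed as simple grid masks. The sketch below is illustrative only: the function name and the toy 2 by 2 grids are hypothetical, and it assumes that suitability is scaled from 0 to 1 so that 0 is the lowest possible value.

```python
import numpy as np

def apply_constraints(likelihood, built_earlier, protected):
    """Apply the two prediction constraints to a likelihood grid.

    likelihood:    2-D float array of change likelihoods (assumed to lie in [0, 1]).
    built_earlier: boolean grid, True where the cell was already construction.
    protected:     boolean grid, True where new construction is legally barred.
    Returns the constrained suitability grid and the mask of candidate cells.
    """
    # Constraint 2: protected cells get the lowest suitability (0 by assumption).
    suitability = np.where(protected, 0.0, likelihood)
    # Constraint 1: already-built cells cannot become *new* construction.
    candidate = ~built_earlier & ~protected
    return suitability, candidate

likelihood = np.array([[0.9, 0.2], [0.7, 0.6]])
built_1988 = np.array([[True, False], [False, False]])
protected = np.array([[False, False], [True, False]])
suitability, candidate = apply_constraints(likelihood, built_1988, protected)
print(suitability)
print(candidate)
```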

The Neural Network Toolbox of Matlab software was used for the design, training and prediction. The neural network was designed to have a flexible number of inputs depending on the number of predictor variables, a number of hidden units equal to the number of input units, and one output unit (Fig. 6). All input grids were normalized to a range from 0 to 1. The output layer contained binary data that represented whether a cell location changed (1 = construction; 0 = non-construction) during the study period. The tan-sigmoid transfer function was used for the activation of the hidden neurons. The linear transfer function (purelin) was used for the activation of the output neuron (Tsoukalas and Uhrig, 1997). The network was trained with the training data, and the overall MSE generated by Matlab at each cycle was stored in a file for analysis.
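The network layout described here can be sketched outside Matlab as follows. This is an illustrative NumPy approximation, not the toolbox code: min-max scaling is assumed as the normalization to the 0 to 1 range, the hyperbolic tangent stands in for the tan-sigmoid function, the output unit is linear, and the weights are random placeholders rather than trained values.

```python
import numpy as np

def min_max_normalize(grid):
    """Rescale a grid linearly so that its minimum is 0 and its maximum is 1."""
    lo, hi = grid.min(), grid.max()
    if hi == lo:
        return np.zeros_like(grid, dtype=float)  # constant grid: no spread to scale
    return (grid - lo) / (hi - lo)

def forward(X, W1, b1, W2, b2):
    """Feed-forward pass: tan-sigmoid hidden layer, linear output unit."""
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ W2 + b2

# Hypothetical elevation values (metres) for a 2 x 2 grid.
elevation = np.array([[1100.0, 1300.0], [1500.0, 1900.0]])
elevation_norm = min_max_normalize(elevation)

# Seven inputs, seven hidden units (equal to the inputs), one output unit.
n_inputs = 7
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.5, (n_inputs, n_inputs))
b1 = np.zeros(n_inputs)
W2 = rng.normal(0.0, 0.5, (n_inputs, 1))
b2 = np.zeros(1)

cells = rng.random((4, n_inputs))  # four cells, already scaled to [0, 1]
likelihood = forward(cells, W1, b1, W2, b2)
print(elevation_norm)
print(likelihood.shape)
```

Because the output unit is linear, its values are not confined to the 0 to 1 range unless the trained weights keep them there.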

Cells that were predicted to transition to new construction areas were compared with the cells that actually did transition during the study period. The study area includes 245,588 cells, of which 85,956 (35.0%) are restricted from undergoing transition while 159,632 (65.0%) can be subject to transition in TMA. The ANN estimated that 127,706 (80.0%) of the qualified cells had change likelihood values of 0, while 3,193 (2.0%) had likelihood values of 1 and the other 28,733 cells (18.0%) had values between 0 and 1. Cells with values closest to 1 were selected as the locations most likely to transition. The critical threshold value, which is the lowest change likelihood value selected for cells to transition during the 12-year period, was 0.5. Only 3.5% of all qualified areas changed to urban in the observed databases. The results show that only 5,587 cells underwent transition in TMA.
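The cell counts reported in this paragraph can be cross-checked with a few lines of arithmetic. The variable names are illustrative; the check assumes that the 3.5% of observed change refers to the 159,632 qualified cells, which appears consistent with the reported 5,587 transitioning cells.

```python
total_cells = 245_588
restricted = 85_956
qualified = total_cells - restricted  # cells that can undergo transition

likelihood_zero = 127_706
likelihood_one = 3_193
likelihood_between = 28_733

def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

restricted_share = share(restricted, total_cells)
qualified_share = share(qualified, total_cells)
class_shares = [share(n, qualified)
                for n in (likelihood_zero, likelihood_one, likelihood_between)]

# Assumption: the 3.5% observed change is taken over the qualified cells.
observed_transitions = round(0.035 * qualified)

print(qualified, restricted_share, qualified_share, class_shares, observed_transitions)
```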

The Kappa statistic quantifies the level of agreement between two maps relative to the agreement expected by chance. We follow Pontius (2002) and Sousa et al. (2002), who suggest that KS values of less than 0.4 reflect poorly performing models, 0.4 to 0.6 are fair, 0.6 to 0.8 are good, and KS values greater than 0.8 represent excellent agreement between model and observed datasets. The ROC curve plots the true-positive rate (true positives as a share of observed positives) against the false-positive rate (false positives as a share of observed negatives) as the threshold value is varied between 0 and 1. We use a nonparametric approximation in SPSS (SPSS Inc., 2003) to estimate the area under the curve that is produced by varying the threshold. AROC is calculated for each of the simulations and plotted as a function of simulation training cycle and percentage of urban change. Both metrics are derived from a standard two-by-two contingency table (Pontius, 2002) created when simulated and observed maps are compared (Table 1).
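Both metrics can be computed directly from observed classes and simulated likelihoods. The sketch below is not the SPSS procedure used in the study: it assumes Cohen's form of the Kappa statistic on the two-by-two table and a pairwise (rank-based) nonparametric estimate of the area under the ROC curve, and the eight-cell data set is invented for illustration.

```python
import numpy as np

def contingency(observed, predicted):
    """Two-by-two table of observed versus simulated change (1) and no change (0)."""
    tp = int(np.sum((observed == 1) & (predicted == 1)))
    fp = int(np.sum((observed == 0) & (predicted == 1)))
    fn = int(np.sum((observed == 1) & (predicted == 0)))
    tn = int(np.sum((observed == 0) & (predicted == 0)))
    return tp, fp, fn, tn

def kappa(tp, fp, fn, tn):
    """Agreement beyond chance: (observed - chance) / (1 - chance)."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (p_observed - p_chance) / (1.0 - p_chance)

def aroc(observed, score):
    """Nonparametric area under the ROC curve.

    Share of (changed, unchanged) cell pairs in which the changed cell has the
    higher likelihood; ties count as one half.
    """
    pos = score[observed == 1]
    neg = score[observed == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size))

# Invented example: observed change and simulated likelihood for eight cells.
observed = np.array([1, 1, 1, 0, 0, 0, 0, 0])
score = np.array([0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05])
predicted = (score >= 0.5).astype(int)  # threshold of 0.5

table = contingency(observed, predicted)
ks = kappa(*table)
area = aroc(observed, score)
print(table, round(ks, 3), round(area, 3))
```

In this invented example the table is (2, 1, 1, 4), giving a Kappa of 7/15 (about 0.47, "fair" on the scale above) and an area of 13/15 (about 0.87).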

AROC and KS were computed at intervals of 100 cycles and the results were plotted against each other (Fig. 7). After 5,000 cycles, the accuracy of the model in predicting new construction areas in TMA was 80% and 78% for AROC and KS, respectively (Fig. 8).

To rank the input variables according to their influence on model performance, we compared the predictive ability of seven reduced-variable versions of the model. We excluded each variable in turn and ran the UEM with the other six variables. We saved the AROC for each of the six-variable models across all of the training cycles at intervals of 100 cycles. The resulting value represents the relative effect of each predictor variable on model performance. The AROC values of the six-variable models at the maximum number of cycles were compared against each other, and it was concluded that the rank order of the predictor variables according to their influence on model performance was: distance to building areas, distance from green space, distance from road, slope, distance from service centre, aspect and elevation.
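The exclusion procedure can be sketched as a loop that drops one variable at a time and records the AROC of the reduced model. Several parts of this sketch are assumptions: the paper does not state the exact ranking rule, so a lower reduced-model AROC is taken here to mean a more influential variable; a least-squares linear fit stands in for training the UEM; and the three synthetic variables are hypothetical.

```python
import numpy as np

def aroc(observed, score):
    """Pairwise nonparametric area under the ROC curve (ties count one half)."""
    pos = score[observed == 1]
    neg = score[observed == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size))

def linear_score(X, y):
    """Hypothetical stand-in for training the model: least-squares fit, scored by AROC."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return aroc(y, A @ coef)

def rank_by_exclusion(X, y, names, score_fn):
    """Drop each variable in turn; a lower reduced-model AROC means more influence."""
    reduced = {name: score_fn(np.delete(X, j, axis=1), y)
               for j, name in enumerate(names)}
    ranking = sorted(names, key=lambda name: reduced[name])
    return ranking, reduced

# Synthetic data in which the first variable drives most of the change.
rng = np.random.default_rng(0)
X = rng.random((300, 3))
signal = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.3, 300)
y = (signal > np.median(signal)).astype(int)
names = ["strong", "weak", "irrelevant"]

ranking, reduced = rank_by_exclusion(X, y, names, linear_score)
print(ranking)
print({name: round(value, 3) for name, value in reduced.items()})
```

With seven predictors the same loop would produce the seven six-variable models described above, one per excluded variable.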

Interestingly, the curves of the six-variable models appear to adhere to two major inflection points. At the first, during the first 1000 cycles, nearly all of the predictor variables show a strongly positive slope. The second inflection point occurs at about 3500 cycles, which were adequate to stabilize the AROC values.

After the net was trained and calibrated successfully and the biases and weights were obtained, the feed-forward algorithm was used for prediction. Given the urban expansion model between 1988 and 2000, and assuming that the same rate of urban change persists, new construction areas of TMA were derived for 2012 based on the full 2000 dataset. The suitability map in the ANN is created from the seven independent input variables. Fig. 9 shows the predicted new construction areas in 2012. Future urban expansion appears to be focused west of the TMA. A great deal of clumped development is anticipated in the western portion of the TMA. In addition, a great deal of dispersed development is anticipated in the southern portion of the TMA. However, almost no development is predicted in the centre and north of the TMA, because there is no space left to develop in the centre and there are legal restrictions imposed by the government on development in the north. What is interesting in these forecasting results is that the model predicts that new construction areas occur along building areas.

CONCLUSION

Models of urban expansion are tools to support the analysis of the causes and consequences of urban change in order to better understand the functioning of the urban expansion system and to support urban planning and policy. The UEM presented in this paper examines the relationship between seven predictor variables and urbanization. Integrating ANNs with GIS is essential for modelling urban changes because of the spatial nature of many of the input variables. The use of an appropriate relationship model is critical for a reliable prediction of future urbanization. The alternative approach of computational neural networks examines the relationship between the seven predictor variables and urbanization, and achieves higher overall predictive ability than other methods when facing a complex system. We employed the AROC and KS calibration metrics to obtain different information on how well the model compares to observed urban expansion. The overall accuracy of the model in predicting urban expansion was 80% and 78% with AROC and KS, respectively, for TMA. By developing seven versions of the UEM, each with one of the variables removed, we assessed the relative contribution of each variable to model performance using the AROC calibration metric. Distance to building areas was the best predictor of urban expansion during the 1988-2000 period in TMA. However, we found this for the model of TMA, where urban expansion is highly clumped.

ACKNOWLEDGEMENTS

The topographic maps provided by NCC are highly appreciated.

REFERENCES

Anderson, J. R., Hardy, E. E., Roach, J. T. and Witmer, R. E. (1976). A land use and land cover classification system for use with remote sensor data. US Geological Survey, Professional Paper 964, 28, Reston, VA.
Congalton, R. G., Oderwald, R. G. and Mead, R. A. (1983). Assessing Landsat classification accuracy using discrete multivariate analysis statistical techniques. Photogrammetric Engineering and Remote Sensing, 49, 1671-1678.
Congalton, R. G. and Green, K. (1999). Assessing the accuracy of remotely sensed data: principles and practices, Lewis, Boca Raton FL.
Li, X. and Yeh, A. G. O. (2001). Calibration of cellular automata by using neural networks for the simulation of complex urban systems. Environment and Planning A, 33(8), 1445-1462.
Li, X. and Yeh, A. G. O. (2002). Urban simulation using neural networks and cellular automata for land use planning. In Proc. ISPRS Commission IV on Geospatial Theory, Processing and Applications, Ottawa, CA. Vol. XXXIV, part 4, ISSN 1682-1750.
Monserud, R. A. and Leemans, R. (1992). Comparing global vegetation maps with the Kappa statistic. Ecological Modelling, 62, 275-293.
Pijanowski, B. C., Gage, S. H. and Long, D. T. (2000). A land transformation model: Integrating Policy, Socioeconomic and environmental drivers using a geographic information system. In Landscape Ecology: A Top Down Approach, J. Sanderson and L. Harris (Eds) (Boca-Raton: CRC Press, Lewis Publisher).
Pijanowski, B. C., Brown, D. G., Shellito, B. A. and Manik, G. A. (2002). Using neural networks and GIS to forecast land use changes: a land transformation model. Computers, Environment and Urban Systems, 26(6), 553-575.
Pijanowski, B. C., Pithadia, S., Shellito, B. A. and Alexandridis, K. (2005). Calibrating a neural network-based urban change model for two metropolitan areas of Upper Midwest of the United States. Int. J. Geog. Inform. Sciences, 19, 197-215.
Pijanowski, B. C., Alexandridis, K. T. and Muller, D. (2006). Modelling urbanization patterns in two diverse regions of the world. Journal of Land Use Science, 1(2-4), 83-109.
Pontius Jr., R. G. (2002). Statistical methods to partition effects of quantity and location during comparison of categorical maps at multiple resolutions. Photogrammetric Engineering and Remote Sensing, 68, 1041-1049.
Pontius Jr., R. G. and Suedmeyer, B. (2004). Components of agreement and disagreement in categorical maps at multiple resolutions. In Remote Sensing and GIS Accuracy Assessment, R. S. Lunetta and J. G. Lyon (Eds.), 233-251, Boca Raton FL: CRC Press.
Pontius Jr., R. G., Huffaker, D. and Denman, K. (2004a). Useful techniques of validation for spatially explicit land-change models. Ecological Modelling, 179(4), 445-461.
Pontius Jr., R. G., Shusas, E. and McEachern, M. (2004b). Detecting important categorical land changes while accounting for persistence. Agriculture, Ecosystems & Environment, 101(2-3), 251-68.
Pontius Jr., R. G. and Spencer, J. (2005). Uncertainty in extrapolations of predictive land change models. Environment and Planning B, 32, 211-230.
Pontius Jr., R. G. and Cheuk, M. L. (2006). A generalized cross-tabulation matrix to compare soft-classified maps at multiple resolutions. Int. J. Geog. Inform. Science, 20(1), 1-30.
Smits, P. C., Dellepiane, S. G. and Schowengerdt, R. A. (1999). Quality assessment of image classification algorithms for land-cover mapping. International Journal of Remote Sensing, 20, 1461-1486.
Sousa, S., Caeiro, S. and Painho, M. (2002). Assessment of map similarity of categorical maps using Kappa Statistics: The Case of Sado Estuary. Paper presented at the ESIG 2002, Tagus Park, Oeiras.
SPSS Inc. (2003). SPSS for Windows (Version 12.0). SPSS Inc., Chicago, USA.
Tayyebi, A., Delavar, M.R., Saeedi, S. and Amini, J. (2008a). Monitoring the urban expansion by multitemporal GIS maps. Paper presented at the FIG conference, Stockholm, Sweden.
Tayyebi, A., Delavar, M. R., Saeedi, S., Amini, J. and Alinia, H. (2008b). Monitoring land use change by multi-temporal LANDSAT remote sensing imagery. Proc. ISPRS Commission VII, The International Society for Photogrammetry and Remote Sensing, Beijing, China.
Tsoukalas, L. H. and Uhrig, R. E. (1997). Fuzzy and neural approaches in Engineering. John Wiley and Sons, Inc., New York, 189-289 and 385-405.
Wilkinson, G. G. (2005). Results and implications of a study of fifteen years of satellite image classification experiments. IEEE Transactions on Geosciences and Remote Sensing, 43, 433-440.
Wu, J. (2004). Effects of changing scale on landscape pattern analysis: scaling relations. Landscape Ecology, 19, 125-138.
Yeh, A. G. and Li, X. (2003). Simulation of development alternatives using neural networks, cellular automata, and GIS for urban planning. Photogrammetric Engineering and Remote Sensing, 69, 1043-1052.
