FPGA-Based Colour Image Classification for Mobile Robot Navigation

Qingrui Zhou, Kui Yuan, Wei Zou, Huosheng Hu, Senior Member, IEEE

Qingrui Zhou, Kui Yuan, and Wei Zou are with the Institute of Automation, CAS, Beijing, China. H. Hu is with the Department of Computer Science, University of Essex, Colchester CO4 3SQ, United Kingdom. E-mail: hhu@essex.ac.uk.

Abstract: This paper presents a real-time colour image classification algorithm for mobile robot navigation based on Field Programmable Gate Arrays (FPGA). To calibrate object colours under different lighting conditions, a statistical ellipsoidal model is adopted. We build a 3-D Colour Look-up Table (CLUT) in which only 18 bits are used to represent a colour instead of the conventional 24 bits. This method resolves the problem of object colours overlapping in colour space, and offers higher classification accuracy and lower memory cost than traditional methods. It is implemented on an FPGA, which greatly reduces the CPU computation burden and markedly improves the performance of robot vision systems. The method is validated by applications on a mobile robot.

I. INTRODUCTION

Real-time image processing is one of the key challenges in mobile robot navigation. Although parallel processing computers offer a reasonable solution, they are expensive and may not be easy to embed on a robot. Programmable hardware in the form of FPGAs has been proposed as a convenient platform for image processing in order to achieve high performance [1]-[3]. Based on an FPGA, we design an intelligent frame grabber (IFG), which can not only grab but also process images in real time.

In the vision system of intelligent mobile robots, classifying the pixels in an image into different colour classes not only provides a solid foundation for navigation, decision-making and self-localization, but also greatly improves the performance of the vision system. Approaches to this task include linear colour thresholding, nearest neighbour classification, colour space thresholding and probabilistic methods. Colour space thresholding has real-time performance, but cannot adapt to varying lighting conditions, which restricts its use in many real-world applications. Many researchers have therefore worked to improve the approach and have made some progress [5], [6]. However, the methods developed so far did not remove the overlap of target colour classes, since they defined the colour classes in 1D or 2D colour space. A 3D Colour Look-Up Table (CLUT) is seldom used in a vision system because of its high memory cost. In this paper, a 3D CLUT is built using only 18-bit integers, which not only keeps the good classification result but also reduces the memory cost greatly. To ensure that the classification algorithm adapts to lighting conditions, target colours are calibrated based on a statistical ellipsoid model. We then implement the algorithm using an FPGA on an IFG, which improves the performance of the vision system and has been used in RoboCup middle-size robots.

The rest of this paper is organized as follows. Section II describes the algorithm for classifying image targets by colour. Section III introduces the intelligent frame grabber and the implementation of the algorithm on an FPGA. In Section IV, experimental results are presented to show the performance of the proposed system. Finally, a brief conclusion and future work are given in Section V.

II. ALGORITHM FOR CLASSIFICATION

A. Colour Space

Several colour spaces have been widely used in mobile robot navigation, including HSV, YUV and RGB, each of which has its own characteristics and uses. The choice of colour space for classification depends on several factors, such as the type of digitizing hardware and the suitability for a particular application.

RGB is the most commonly used model for TV systems and pictures acquired by digital cameras. Video monitors display colour images by modulating the intensity of three primary colours (red, green and blue) at each pixel of an image. RGB is suitable for colour display, but not good for colour scene segmentation and analysis because of the high correlation among the R, G and B components. By high correlation, we mean that if the intensity changes, all three components change accordingly. Also, the measurement of a colour in RGB space does not represent colour differences on a uniform scale. Hence, the similarity of two colours cannot be evaluated from their distance in RGB space.

YUV is also a TV colour representation, used in European TV systems, which separates the colour information of an image from its intensity information. The chrominance is coded in two dimensions (U and V) while the intensity is coded in the Y component. Thus a particular colour can be described as a column spanning all intensities. This colour space is therefore more useful than RGB for classification by colour.

The HSI (Hue, Saturation and Intensity) system is another commonly used colour space in image processing, and is more intuitive to the human vision system. However, HSI colours cannot be provided by digitising hardware directly, and the non-linear transformation from RGB or YUV to HSI is computationally expensive. So the HSI colour space is not suitable for a real-time vision system.

B. Analysis of Algorithm

Colour space thresholding is widely used to improve the real-time performance of classification. This approach uses a set of constant thresholds to define a colour class as a rectangular block in the colour space. A particular pixel is then classified according to which partition it lies in. Bruce et al. studied this approach and improved the efficiency of classification [5]. With their approach, classifying a pixel into one or more of up to 32 colours uses only two logical AND operations, whereas a naive approach could require up to 192 comparisons for the same classification. However, the shape of colour classes in real applications is often not a rectangular polyhedron, so misclassification is inevitable.

Tang et al. extended the approach by using the three projections (RG, RB and GB) of the 3D colour class to describe the colour class [6], which is referred to below as 2D calibration.
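The rectangular-threshold scheme of Bruce et al. described above can be sketched as follows. This is a minimal illustration, not code from [5] or from this paper: each channel gets a 256-entry table of 32-bit masks, with one bit per colour class, and a pixel's class membership is the AND of three table entries. The function names and the threshold ranges in the usage note are hypothetical.

```python
def make_tables():
    """One 256-entry mask table per channel (Y, U, V); bit k stands for class k."""
    return [[0] * 256 for _ in range(3)]


def add_class(tables, bit, y_range, u_range, v_range):
    """Register a rectangular colour class given inclusive (lo, hi) thresholds."""
    for table, (lo, hi) in zip(tables, (y_range, u_range, v_range)):
        for value in range(lo, hi + 1):
            table[value] |= 1 << bit


def classify_thresholds(tables, y, u, v):
    """Return a bit mask of all classes containing the pixel: two AND operations."""
    return tables[0][y] & tables[1][u] & tables[2][v]
```

With two made-up classes, `add_class(t, 0, (80, 200), (60, 110), (170, 230))` and `add_class(t, 1, (100, 230), (30, 100), (140, 190))`, a pixel such as (150, 80, 180) falls inside both boxes and comes back with both bits set. This is the kind of overlap between similar colours that rectangular classes cannot avoid.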
With 2D calibration, the shape of colour classes is enriched, and both the classification accuracy and the separability of several colours are improved. However, this approach still cannot describe the colour classes exactly, and overlaps between similar colour classes in colour space remain. The overlap arises because many additional colours are included in a colour class when it is restored from its three 2D projections. Fig. 1 shows that similar colours cannot be classified correctly by 2D calibration. Fig. 2 shows the distribution of two colour classes in colour space.

Fig. 1: Experiment for 2-D calibration. Left: the original image, in which there are two targets, an orange ball and a yellow gate. Right: the result of classification.

Fig. 2: The distribution of two colour classes in colour space. (a) Two colour classes: red and yellow blobs represent the orange and yellow target colour classes respectively. (b) Overlap of the two colour classes.

The best way is to define the colour class directly in a 3D colour space and check whether a pixel is a member of the colour classes defined by a 3D CLUT. However, the common colour spaces (RGB, YUV) are very large 3D spaces. If the class of every colour is stored, the CLUT has a very high memory cost because there are 16777216 colours in the colour space. So this approach has not been used.

In this paper, by analysing the effect of every YUV component on image quality, we build the CLUT using only 18 bits instead of the conventional 24 bits, which keeps the classification effective and greatly reduces the memory cost of the CLUT. Therefore, the YUV colour space is used in our approach.

Through many experiments, we found that representing a pixel using only the high 6 bits of each colour component (bit 1 and bit 0 set to 0) has no obvious adverse effect on classification. Fig. 3 shows that there are no obvious differences between Fig. 3(a) and Fig. 3(b). However, abrupt steps in colour and grey level can be seen in Fig. 3(c), which can cause misclassification when the target colours are similar.

Fig. 3: Three representations of the same image with variation in the number of grey levels used: (a) 8-bit; (b) 6-bit; (c) 5-bit. The four pictures in each of (a), (b) and (c) are the colour image and the grey images of Y, U and V, from left to right. In (b) and (c), each component of a pixel keeps only its high 6 bits and 5 bits respectively, and the low 2 bits and 3 bits are set to 0.

C. Description of Algorithm

We use a 6-bit integer to represent each component of a pixel and define the colour classes in a 3D space, so there are only 262144 ($2^{18}$ = 256K) colours in the colour space, and a binary code of the colour class is stored in the CLUT instead of the data used in 1D and 2D calibration. When there are fewer than 16 target colours, each element in the CLUT occupies only 4 bits of memory, so the memory cost of the CLUT is 128K bytes ($2^{6 \times 3} \times 4$ bit), while the memory cost of 2-D calibration is 384K bytes.

To improve the adaptability of the algorithm to lighting conditions, colour training based on a statistical ellipsoid model is used while initialising the 3-D CLUT. The algorithm is as follows:

Step 1: initialisation

Define a 3-D array $\mathrm{YUV}_{64 \times 64 \times 64}$ and initialise it to 0, and encode the colour classes with 4 bits (there are fewer than 16 target colours in our vision system). That is, each array element of the CLUT is a 4-bit integer.

Step 2: colour training

Firstly, select a target colour area with the mouse. The colours of the pixels in this area are continuous and approximately follow a normal distribution. We calculate the statistical characteristics of the selected pixels, namely the means $(\bar{y}, \bar{u}, \bar{v})$ and variances $(\sigma_y^2, \sigma_u^2, \sigma_v^2)$ of the Y, U and V components, as follows:

$$\bar{y} = \frac{1}{n}\sum_{i \in S} y_i, \quad \bar{u} = \frac{1}{n}\sum_{i \in S} u_i, \quad \bar{v} = \frac{1}{n}\sum_{i \in S} v_i \qquad (1)$$

$$\sigma_y^2 = \frac{1}{n}\sum_{i \in S} (y_i - \bar{y})^2 \qquad (2)$$

$$\sigma_u^2 = \frac{1}{n}\sum_{i \in S} (u_i - \bar{u})^2 \qquad (3)$$

$$\sigma_v^2 = \frac{1}{n}\sum_{i \in S} (v_i - \bar{v})^2 \qquad (4)$$

where $S$ and $n$ denote the selected target area and the total number of pixels within the target area respectively. Then we construct an ellipsoid in the colour space using $(\bar{y}, \bar{u}, \bar{v})$ as the centre and $(\sigma_y, \sigma_u, \sigma_v)$, scaled by $\lambda$, as the radii:

$$\frac{(y - \bar{y})^2}{(\lambda\sigma_y)^2} + \frac{(u - \bar{u})^2}{(\lambda\sigma_u)^2} + \frac{(v - \bar{v})^2}{(\lambda\sigma_v)^2} \le 1 \qquad (5)$$

where $\lambda$ denotes any given positive number. According to probability theory [7], we have

$$\int_{-2\sigma}^{2\sigma} f(x)\,dx \approx 0.95 \qquad (6)$$

where $x$ is a normally distributed variable measured from its mean, and $f(x)$ and $\sigma^2$ are its density function and variance. Thus, we select $\lambda = 2$ in the ellipsoid defined above. Finally, we take all the colours $(y, u, v)$ in this ellipsoid as the target colour under this lighting condition and execute the following operations:

$$y' = y \gg 2; \quad u' = u \gg 2; \quad v' = v \gg 2; \quad \mathrm{YUV}_{64 \times 64 \times 64}(y', u', v') = \text{Code of target colour} \qquad (7)$$

where $y \gg 2$ means that $y$ is shifted right by 2 bits. Repeat the same operations for all target colours in all possible lighting conditions.

Step 3: classification

After colour training is finished, the class of each colour is stored in the CLUT. For each pixel $P(y_p, u_p, v_p)$, let

$$y'_p = y_p \gg 2; \quad u'_p = u_p \gg 2; \quad v'_p = v_p \gg 2 \qquad (8)$$

By using the vector $(y'_p, u'_p, v'_p)$ as indices into the CLUT, the return value from the CLUT is exactly the code of the colour class that the pixel belongs to. Experimental results show that the algorithm can classify pixels accurately and in real time under different lighting conditions (Fig. 4).

Fig. 4: Experiment of classification. Left: the original image. Right: the result of classification.

III. IMPLEMENTATION OF CLASSIFICATION

The algorithm in Section II can perform the classification rapidly and accurately; however, an intelligent mobile robot has many other tasks to carry out, for example multi-sensor fusion, multi-robot communication and route planning. The computing power of the CPU is always a bottleneck for improving the real-time performance of a robot. Our aim is to reduce the computing burden of image processing and improve the performance of the whole system at the same time. To implement some image processing algorithms in hardware without increasing the system cost, we design an intelligent frame grabber (IFG) based on a high-performance FPGA.

A. Hardware architecture of IFG

Fig. 5 shows the architecture of the IFG. The IFG is composed of a PCI interface chip, a video AD, an FPGA and high-speed SRAM. The PCI interface chip is a PLX PCI9054, which provides the interface to the PCI bus on one side and to the local bus on the other. The video AD is an SAA7113, an I2C-bus controlled video input processor that can output data in YUV 4:2:2 format and provide real-time status information. 1 Mbyte of high-speed SRAM is used to store the CLUT and images on the IFG. The high-performance FPGA on the IFG implements the real-time classification and controls the whole board, including the frame grabber, access to SRAM and bus arbitration.

B. Implementation of Classification

The implementation of classification on the IFG is shown in Fig. 6, where the modules inside the square denote the FPGA implementation. The control module is the control centre of the FPGA, which sets the FPGA to work in colour training or classification mode according to the command from the CPU. It provides the interface to SRAM, the coefficient configuration of a digital filter and the control signals to the video AD. Using the information provided by the video AD, the video data control module samples the valid video data and reassembles the data in YUV888 format.
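Because each CLUT entry is a 4-bit code, two entries share one SRAM byte, so the $2^{18}$-entry table of Section II.C occupies $2^{17}$ bytes. The sketch below is a software model of that packed table covering both working modes: training with the statistical ellipsoid of Eqs. (1)-(7) and classification by lookup as in Eq. (8). It is an illustration under stated assumptions, not the authors' implementation: the order in which the Y, U and V bits are concatenated into the byte address, the choice of the upper nibble when V bit 2 is 1, the 0.5 floor on the radii, and all function names are ours.

```python
import math


def new_clut():
    """2**18 four-bit class codes packed two per byte: 2**17 bytes = 128 KB."""
    return bytearray(1 << 17)


def address(y, u, v):
    """Map an 8-bit YUV pixel to (17-bit byte address, nibble select).

    Assumed bit order: Y[7:2] | U[7:2] | V[7:3]; V bit 2 picks the nibble.
    """
    addr = ((y >> 2) << 11) | ((u >> 2) << 5) | (v >> 3)
    return addr, (v >> 2) & 1


def store(clut, y, u, v, code):
    """Write a 4-bit class code for the 6-bit cell containing (y, u, v)."""
    addr, high = address(y, u, v)
    if high:  # assumption: select = 1 means the upper nibble
        clut[addr] = (clut[addr] & 0x0F) | ((code & 0x0F) << 4)
    else:
        clut[addr] = (clut[addr] & 0xF0) | (code & 0x0F)


def classify(clut, y, u, v):
    """Return the class code of a pixel; 0 means no target colour."""
    addr, high = address(y, u, v)
    return (clut[addr] >> 4) if high else (clut[addr] & 0x0F)


def train(clut, samples, code, lam=2.0):
    """Mark every 8-bit colour inside the ellipsoid fitted to `samples`."""
    n = len(samples)
    mean = [sum(p[k] for p in samples) / n for k in range(3)]
    var = [sum((p[k] - mean[k]) ** 2 for p in samples) / n for k in range(3)]
    # Radii are lam * sigma; the 0.5 floor is our own guard against zero variance.
    rad = [max(lam * math.sqrt(s), 0.5) for s in var]
    lo = [max(0, math.floor(m - r)) for m, r in zip(mean, rad)]
    hi = [min(255, math.ceil(m + r)) for m, r in zip(mean, rad)]
    # Only the bounding box of the ellipsoid needs to be visited.
    for y in range(lo[0], hi[0] + 1):
        for u in range(lo[1], hi[1] + 1):
            for v in range(lo[2], hi[2] + 1):
                d = sum(((c - m) / r) ** 2
                        for c, m, r in zip((y, u, v), mean, rad))
                if d <= 1.0:
                    store(clut, y, u, v, code)  # later classes overwrite earlier ones
```

In use, `train` would be called once per target colour and lighting condition with the pixels selected by the operator, after which `classify` is a single table read per pixel. In this model, a class trained later overwrites an earlier one wherever their ellipsoids share a cell.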
These data are fed into a digital filter, which is a variable coefficient filter or image algebra operator [12]. The address generator of classification uses Y bits 7-2, U bits 7-2 and V bits 7-3 as a 17-bit address into the CLUT, and V bit 2 as the control signal of MUX-2 I. MUX-2 I is a 4-bit multiplexer, which selects the higher or lower 4 bits of the 8-bit word from SRAM as the code of the target colour that the current pixel belongs to. The address bus arbitration module receives the 17-bit address from the address generator of classification and a 20-bit address from the local bus, and then generates a 20-bit address to access the SRAM. The data bus control logic implements the data control: it reassembles 24-bit video data or 32-bit data from the local bus and writes them to SRAM in 8-bit format. Through this unit, 8-bit data from SRAM or from MUX-2 II can be sent to the local bus. Note that MUX-2 II is an 8-bit multiplexer, which is controlled by the Control module.

The implementation procedure of the algorithm on the FPGA is as follows:

Step 1: colour training

Colour training must be executed before classification, and is completed by human-computer interaction. The CPU outputs colour training commands to the Control module via an IO port. Under the control of the Control module, the data from the AD are sent to the CPU through the Video data control and Data bus control modules. This function provides the original image for the CPU. After fini