Second Report Draft


Consumers today expect more from computer games. Our Final Year Project, the “Tele-Table Project”, therefore aims to provide a multi-purpose interactive platform that uses real objects and connects players over the Internet. To improve on current game styles and to create a new generation of games, the project applies Augmented Reality.

This semester, we address the problem of randomly shuffled tiles. The tiles in the games are real objects, while the images shown on them are virtual. This combination of real and virtual is what Augmented Reality provides.

In this report, we discuss the motivation and objectives of our project in detail and go through the architecture design of Tele-Table. We also discuss the implementation and the difficulties encountered.



Real games provide instant excitement, and virtual computer games provide entertainment. We use Augmented Reality to combine the two and so improve on current game platforms.

According to NPD Group / Point-of-Sale Information, U.S. computer and video game unit sales and dollar sales grew threefold between 1996 and 2006. From this we expect the computer and video game market to keep growing.

An investigation of current popular game platforms found two main distinguishing characteristics: whether the game uses a network, and whether it uses real objects or virtual images. The investigation found no game platform that combines real objects with virtual images while also connecting the players over a network. Such a system can give players a real game experience and real interaction. Since no such platform is on the market yet, developing one could win market share and improve competitiveness in the game industry.

Tele-Table is therefore developed using Augmented Reality. Players use Tele-Table to play games interactively and remotely over the Internet.


The project aims to provide a generic platform that game developers can use to develop board games, card games and other interactive games.

The designed game platform should achieve the following targets:

1. The background can be changed freely, so that any game can be developed.

2. Real objects can be used to play games interactively with the player on the other side.

3. The connection between the two players over the Internet should be fast, so as to provide a more realistic experience.

4. Applications beyond game development should also be explored.

This semester, our group addresses the problem of playing games with randomly shuffled real objects on Tele-Table. This works towards Target 2.



Augmented Reality (AR) is a subset of Mixed Reality, which deals with the combination of the real world and computer-generated data. The Virtuality Continuum, first introduced by Paul Milgram, is a continuous scale ranging from the completely virtual (Virtual Reality) to the completely real.

Augmented Reality supplements the real world with virtual, computer-generated objects that appear to co-exist in the same space as the real world. An Augmented Reality system:

1. Combines real and virtual objects in a real environment

2. Runs interactively, and in real time

3. Registers real and virtual objects with each other


To achieve the objective of supporting both real objects and virtual images in a game platform whose players interact remotely over the Internet, Augmented Reality is implemented in the system. For example, in a game of Chinese chess, the computers generate the background while the players move real chess pieces.



Our system is made up of overhead-mounted cameras, plasma monitors and computers.


The overhead cameras are set up to capture events on the monitors. They are the only input devices in the setup, so all input information is captured by these cameras. They can be USB web cameras or digital cameras.

Digital cameras provide higher resolution, while USB web cameras are more common and cheaper. Since the captured video is sent to the computer for analysis, digital cameras were chosen for the implementation because of their higher resolution.


The plasma monitors are placed horizontally to act as platforms for the cards or tiles, giving the players a more realistic game experience.

The real objects are placed on the monitors. At the same time, the background and the virtual images captured by the camera on the other side are displayed on the screen. For the experimental development of the project, we chose a 19-inch monitor.




To implement the Image Processing Module, we introduced several software technologies into our project. This sub-chapter discusses the reasons for choosing them and their limitations.


Microsoft DirectX is a collection of application programming interfaces (APIs) for handling tasks related to multimedia. It is widely used for game programming on Microsoft platforms. The names of the DirectX APIs begin with “Direct”, for example Direct3D and DirectSound.

Microsoft DirectX has two notable features. First, it gives applications low-level access to the hardware, which allows high performance in many games. Second, it provides hardware independence through the hardware abstraction layer (HAL), so that different hardware is accessed through the same DirectX interface.

Our group chose Microsoft DirectX 9.0 for our game programming. Microsoft DirectX 9.0 is made up of eight components. Of these, we used only DirectShow, which handles the video capture and rendering in our application.


Microsoft DirectShow is a multimedia framework on the Microsoft Windows platform for performing different operations on media files or streams. It is based on the Microsoft Windows Component Object Model (COM) framework and provides a common interface for media across many programming languages. It automatically detects and uses video and audio acceleration hardware.

DirectShow simplifies media playback, format conversion and capture. It also exposes the underlying stream control architecture to applications that need custom solutions. DirectShow provides components for most applications, and programmers can extend it by writing their own components. Because DirectShow is based on COM, writing DirectShow components and applications requires knowledge of COM programming.

DirectShow uses a modular architecture. Each processing stage is handled by a COM object called a filter. Filters communicate through connections, and the connection points are called pins. Pins transfer data from one filter to another. A set of connected filters forms a filter graph.

The Filter Graph Manager controls the filters in a filter graph. It (1) coordinates state changes among the filters, (2) handles event communication and (3) provides methods for building the filter graph.
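The idea of filters handing data downstream can be illustrated with a toy model in plain C++. This is only a sketch of the concept and does not use the DirectShow API: the names Sample, Filter, FilterGraph and invertBytes are hypothetical, and real filters are COM objects connected through pins rather than entries in a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a media sample: one block of media data.
struct Sample {
    std::vector<std::uint8_t> data;
};

// Hypothetical stand-in for a filter: a name plus a step that edits
// the sample in place before it moves downstream.
struct Filter {
    std::string name;
    std::function<void(Sample&)> process;
};

// Toy filter graph: filters are stored in connection order, and each one
// hands the sample to the next, as an output pin feeds an input pin.
class FilterGraph {
public:
    void addFilter(Filter f) {
        assert(f.process && "every filter needs a processing step");
        filters_.push_back(std::move(f));
    }

    // Push one sample through the chain, from the source side to the
    // renderer side.
    void run(Sample& s) const {
        for (const Filter& f : filters_) {
            f.process(s);
        }
    }

    std::size_t size() const { return filters_.size(); }

private:
    std::vector<Filter> filters_;
};

// Example transform step: invert every byte of the sample.
inline void invertBytes(Sample& s) {
    for (std::uint8_t& b : s.data) {
        b = static_cast<std::uint8_t>(255 - b);
    }
}
```

In this model, a graph is assembled with calls such as `graph.addFilter({"invert", invertBytes})`, and `graph.run(sample)` plays the role of one sample travelling through the connected filters.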


Our group used DirectShow and the Filter Graph Manager for the image processing part of the application, which covers camera capture, video rendering and image processing, by building a filter graph. DirectShow provides components for camera capture and video rendering. However, our group had to design the filters for pattern recognition and image flattening.


DirectShow filters are COM objects. The application cannot access a filter's variables directly; values must be passed and retrieved by calling functions on the filter's interfaces. This makes the design and implementation of the application more difficult.


Our application uses the Filter Graph Manager to capture the screen, perform calibration and control the running of the system.


The IGraphBuilder interface provides methods for building a filter graph. The Filter Graph Manager implements this interface. It inherits from the IFilterGraph interface, which provides basic operations such as adding a filter to the graph or connecting two pins. The Filter Graph Manager can then select filters that are registered on the user’s system, add them to the graph and connect them.


The ICaptureGraphBuilder2 interface builds capture graphs and other custom filter graphs. Capture graphs are normally more complicated to build, but the ICaptureGraphBuilder2 interface makes building them easier.

To do the capturing, a Filter Graph Manager object and a Capture Graph Builder object, which expose the IGraphBuilder and ICaptureGraphBuilder2 interfaces respectively, are first created. The Capture Graph Builder is then initialized by giving it a pointer to the Filter Graph Manager (the SetFiltergraph method).


The IMediaControl interface controls the flow of data through the filter graph. Our application uses its Run, Pause and Stop methods. The Filter Graph Manager implements this interface.


The IMediaEvent interface retrieves event notifications and overrides the Filter Graph Manager’s default handling of events. It handles events such as end of stream and rendering errors.

Because the default error handling is implemented in the Filter Graph Manager, our application overrides it through the IMediaEvent interface.


The IMediaSample interface sets and retrieves properties on media samples. A media sample contains a block of media data, held in memory buffers that are shared among filters.

Filters use this interface to set properties on samples and deliver them to a downstream filter. The downstream filter uses the interface to retrieve the properties and read the data. A filter can also modify the data and pass it on to the next downstream filter.


OpenCV is the Open Source Computer Vision Library. It was originally developed by Intel. The library is cross-platform, running on Windows, Linux and Mac, and focuses mainly on real-time image processing.

Example applications of the OpenCV library include Human-Computer Interface (HCI); Object Identification; Segmentation and Recognition; Face Recognition; Gesture Recognition; Motion Tracking; Ego-Motion; Motion Understanding; Structure From Motion (SFM) and Mobile Robotics.


OpenCV provides functions for real-time image processing. Using OpenCV functions such as cvCreateImage, cvLoadImage and cvResize makes programming more efficient.


The RGB pixel arrangement in the DirectShow buffer differs from that in the OpenCV buffer, so each frame needs an extra transformation when it is passed from DirectShow to OpenCV. This transformation causes delay, which is undesirable in a real-time system.
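A minimal sketch of one such transformation, assuming the difference is the row order: uncompressed RGB frames in DirectShow are commonly stored bottom-up, while OpenCV images default to a top-left origin. The function name flipRows and its parameters are illustrative and not part of either library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy a bottom-up frame into a top-down buffer by reversing the row order.
// `stride` is the number of bytes per row, including any padding.
std::vector<std::uint8_t> flipRows(const std::uint8_t* src, int height,
                                   std::size_t stride) {
    assert(src != nullptr && height > 0 && stride > 0);
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(height) * stride);
    for (int y = 0; y < height; ++y) {
        // Row y of the output is row (height - 1 - y) of the input.
        const std::uint8_t* srcRow =
            src + static_cast<std::size_t>(height - 1 - y) * stride;
        std::memcpy(dst.data() + static_cast<std::size_t>(y) * stride,
                    srcRow, stride);
    }
    return dst;
}
```

The copy touches every byte of every frame, which gives a sense of where the extra per-frame delay comes from.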


When an image file is loaded into our application, the load-image function is called. A correlation function is called to locate the reference points that give the position of the screen relative to the camera buffer. A matrix multiplication function is also used for rescaling.
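The rescaling step can be sketched as multiplying a point by a 3x3 matrix in homogeneous coordinates. This assumes that calibration produces such a matrix mapping camera-buffer coordinates to screen coordinates; the names Point2 and applyMatrix and the matrix layout are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// A 2D point in pixel coordinates.
struct Point2 {
    double x;
    double y;
};

// Map a point through a 3x3 matrix using homogeneous coordinates:
// (x, y, 1) is multiplied by the matrix and divided by the third component.
Point2 applyMatrix(const double m[3][3], Point2 p) {
    const double x = m[0][0] * p.x + m[0][1] * p.y + m[0][2];
    const double y = m[1][0] * p.x + m[1][1] * p.y + m[1][2];
    const double w = m[2][0] * p.x + m[2][1] * p.y + m[2][2];
    assert(std::fabs(w) > 1e-12 && "point maps to infinity");
    return {x / w, y / w};
}
```

For a pure scale-and-offset matrix the last row is (0, 0, 1), so the division has no effect and the mapping reduces to scaling each axis and adding an offset.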


ARToolKit is a software library for building Augmented Reality (AR) applications, which involve overlaying virtual imagery on the real world.

One of the most difficult parts of developing an Augmented Reality application is precisely calculating the user's viewpoint in real time so that the virtual images are exactly aligned with real world objects. ARToolKit uses computer vision techniques to calculate the real camera position and orientation relative to marked cards. This allows the overlay of virtual objects onto these cards.


ARToolKit applications allow virtual imagery to be superimposed over live video of the real world. The secret is in the black squares used as tracking markers. The ARToolKit tracking works as follows:

1. The camera captures video of the real world and sends it to the computer.

2. Software on the computer searches through each video frame for any square shapes.

3. If a square is found, the software uses some mathematics to calculate the position of the camera relative to the black square.

4. Once the position of the camera is known, a computer graphics model is drawn from that same position.

5. This model is drawn on top of the video of the real world and so appears stuck on the square marker.

6. The final output is shown back in the handheld display, so when the user looks through the display they see graphics overlaid on the real world.


Our application needs a real-time augmented reality library for pattern tracking. ARToolKit provides a simple framework for creating real-time augmented reality applications and overlays 3D virtual objects on real markers using computer vision algorithms. It provides fast marker tracking. The marker patterns are extensible, so our group can design its own marker patterns. It also supports recognition of multiple markers.

ARToolKit supports the Windows platform and multiple input sources (digital cameras, web cameras). It also provides a simple, modular API in C and a complete set of samples and utilities, which made the library easier for our group to learn.


ARToolKit renders quickly using OpenGL, but it does not render through DirectShow. Our group therefore had to build the marker pattern tracking inside the filter. This transformation costs considerable time.

OpenGL's rendering coordinate system is right-handed, with the camera facing in the -Z direction, and ARToolKit follows OpenGL. DirectShow, however, uses a left-handed coordinate system, so our group had to transform between the two coordinate systems.
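A minimal sketch of such a change of handedness, assuming the conversion amounts to mirroring the Z axis of a 3x4 pose matrix [R | t]. The function name flipHandedness is hypothetical, and the exact convention used in the project may differ.

```cpp
#include <cassert>

// Convert a 3x4 pose matrix [R | t] between right- and left-handed
// conventions by mirroring the Z axis:
//   R' = S R S,  t' = S t,  with S = diag(1, 1, -1).
void flipHandedness(const double in[3][4], double out[3][4]) {
    assert(in != nullptr && out != nullptr);
    const double s[3] = {1.0, 1.0, -1.0};
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            // Rotation part: negate entries that involve Z exactly once.
            out[r][c] = s[r] * in[r][c] * s[c];
        }
        // Translation part: negate the Z component only.
        out[r][3] = s[r] * in[r][3];
    }
}
```

Because S is its own inverse, applying the function twice returns the original matrix, which is a convenient sanity check.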


ARToolKit is built into the 2-Input Video Overlay Mixer Filter and the Null In Place Filter.

In the Mixer Filter, ARToolKit is responsible for marker pattern tracking. It first sets up the initial camera parameters by calling the following four functions.

int arParamLoad(const char *filename, int num, ARParam *param, ...)
Loads the camera intrinsic parameters.

int arParamChangeSize(ARParam *source, int xsize, int ysize, ARParam *newparam)
Changes the camera size parameters.

int arInitCparam(ARParam *param)
Initializes the camera parameters.

int arParamDisp(ARParam *param)
Displays the parameters.

The second step is to load the marker pattern into the filter with the following function.

int arLoadPatt(const char *filename)
Loads a marker description from a file.

The third step is to detect the square markers in the video input frame with the following function.

int arDetectMarker(ARUint8 *dataPtr, int thresh, ARMarkerInfo **marker_info, int *marker_num)
Main function for detecting the square markers in the video input frame.

Then, the filter passes the number of markers found and the marker information to the application. The application passes these two variables to the Null In Place Filter.

In the Null In Place Filter, the variables are used in the calculation, and the pictures of the tiles are placed back by DirectShow.
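One plausible use of these two variables is to pick, for a given pattern, the detection with the highest confidence. The sketch below assumes this selection step; MarkerInfo is a simplified stand-in that models only the id and cf (confidence) fields of ARToolKit's ARMarkerInfo, and bestMarkerIndex is a hypothetical helper rather than a library function.

```cpp
#include <cassert>

// Simplified stand-in for ARToolKit's ARMarkerInfo; only the pattern id
// and the confidence value are modelled here.
struct MarkerInfo {
    int id;      // id of the matched pattern, or -1 if no pattern matched
    double cf;   // confidence of the match
};

// Return the index of the most confident detection of `pattId`,
// or -1 if that pattern was not detected in this frame.
int bestMarkerIndex(const MarkerInfo* markers, int markerNum, int pattId) {
    assert(markerNum >= 0 && (markers != nullptr || markerNum == 0));
    int best = -1;
    for (int i = 0; i < markerNum; ++i) {
        if (markers[i].id != pattId) {
            continue;  // a different pattern, or an unrecognised square
        }
        if (best == -1 || markers[i].cf > markers[best].cf) {
            best = i;
        }
    }
    return best;
}
```

A result of -1 can be treated as "tile not visible in this frame", so that the tile picture is simply not drawn.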






We would like to express our thanks to our final year project supervisor, Professor Irwin King. He has provided us with unlimited support and suggestions for the project.

We would like to thank Mr. Edward Yau, Research and Applied Technology Development manager at Video Over InternEt and Wireless (VIEW) Technologies Laboratory, and his laboratory for providing much technical support and many ideas for our project.

We would also like to thank Au Park Shing and Leung Chun Ming of the 2006/2007 Final Year Project “Tele-Table for Interactive Games”. They provided the basic Tele-Table, which made our further implementation much easier.


fyp/report.txt · Last modified: 2008/04/07 14:24 (external edit)