Data fusion is a general term for techniques that combine multiple sources of information, often highly disparate ones, to produce a single ‘best estimate’ or ‘best decision’. This project aims to identify the general principles of data fusion and to make practical suggestions as to how these could be applied, either when analysing data within TRL or when developing tools and processes that allow our clients to make best use of their data.
This report takes a broad view of the most common techniques currently employed to fuse disparate sources of data. It finds that, whilst there is a wide variety of methods in use, the scientific logic that underpins them (or is used to justify them) can generally be expressed in terms of relatively simple probability theory.
The idea of constructing a data fusion method for inter-urban traffic data that is based solely on the logic of probability theory is explored and found to have potential. An example of fusion between journey time data from GPS-equipped vehicles and Automated Number Plate Recognition cameras shows that the fused journey time is considerably closer to the benchmark (from MIDAS inductive loops) than that from either of the individual sources. Although the fundamental principles are established, the method is not fully developed here. Two issues remain unresolved: how best to parameterise the system and how, in the absence of ubiquitous data, to determine reliably the requisite distribution functions that describe the relationships between variables.
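
To make the idea of probability-based fusion concrete, the sketch below combines two journey time estimates by inverse-variance weighting, which is the result of Bayes' rule under the assumption of independent Gaussian errors and a flat prior. It is an illustration only and is not the method developed in the report; the function name `fuse_gaussian` and all journey times and standard deviations are hypothetical values chosen for the example.

```python
def fuse_gaussian(estimates):
    """Fuse independent Gaussian estimates given as (mean, variance) pairs.

    Assumption (not from the report): errors are independent and Gaussian,
    so the fused estimate is the precision-weighted mean.
    """
    precisions = [1.0 / var for _, var in estimates]
    total_precision = sum(precisions)
    fused_mean = sum(p * mean for p, (mean, _) in zip(precisions, estimates)) / total_precision
    fused_var = 1.0 / total_precision
    return fused_mean, fused_var


# Hypothetical journey times in seconds for one road link (illustrative only).
gps_estimate = (620.0, 40.0 ** 2)   # GPS-equipped vehicles: mean, variance
anpr_estimate = (580.0, 20.0 ** 2)  # number plate cameras: mean, variance

fused_mean, fused_var = fuse_gaussian([gps_estimate, anpr_estimate])
print(f"fused journey time: {fused_mean:.1f} s, std dev: {fused_var ** 0.5:.1f} s")
```

Under these assumptions the fused estimate is pulled towards the more precise source, and its variance is smaller than that of either input. In practice the error distributions would have to be estimated from data, which is exactly the unresolved issue noted above.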
