COSPHERE TRIAL DATA

Cellular, 802.11 and Bluetooth network visibility on personal mobile
devices

Arjan Peddemors, Novay, 2009
Contact information available at http://www.novay.nl/


DESCRIPTION

The CoSphere trial gathered network traces on the personal mobile devices
of 12 trial participants over a period of approximately one month in
February/March 2007. The dataset contains information from three
different wireless network interfaces - cellular, 802.11 and Bluetooth -
and provides insight into the richness and dynamics of the visibility of
wireless networks from a user-oriented perspective.

The dataset contains the following visibility information:

- the cellular operator name, operator number, location area code (LAC),
  and cell id (CID) of the associated base station (monitored
  continuously, storing a new base station when the association changes)
- the cellular operator name and operator number of all in-range cellular
  networks (scanned once every 5 minutes)
- the service set identifier (SSID), the basic service set identifier
  (BSSID) and the association state of all in-range 802.11 access points
  (monitored for 1 minute, every 10 minutes)
- the Bluetooth device address (BD_ADDR) and device name of all in-range
  Bluetooth nodes (scanned once every 5 minutes)
- the remaining battery power percentage
- device attachment to AC power


SOFTWARE

The traces were collected using the Network Abstraction Layer (NAL)
software available at http://cosphere.novay.nl/nal/ .


SANITIZATION

One participant experienced several clock resets, to the extent that it
was not possible to restore the chronological order of the logged events.
The traces of this participant are not included in the dataset, reducing
the total number of participants to 11.

To safeguard the privacy of the participants, the data is anonymized in
the following way.
All references to cellular operator names, operator numbers, location
area codes, cell ids, SSIDs, BSSIDs and BD_ADDRs are replaced with
randomized identifiers. The mapping of the real identifiers to the
anonymized identifiers is consistent across all participants, so that,
for example, it is possible to determine whether two participants have
been attached to the same 802.11 access point.


DATA FORMAT

The traces are stored in files in the Attribute-Relation File Format
(ARFF, see http://www.cs.waikato.ac.nz/~ml/weka/arff.html ), in sparse
form. Every line describes the visibility of cellular, 802.11 and
Bluetooth networks at a specific time. Lines are not printed at regular
intervals, but only when the current visibility state differs from the
state described in the previous line. After a scan on a network
interface, the visibility state is taken to be constant until the next
scan, which means that for 802.11 a change may occur every 10 minutes
(more often when the user interacts with the device) and for Bluetooth a
change may occur every 5 minutes.

The sparse .arff file starts with a section describing the 'attributes',
followed by a 'data' section describing the values of these attributes.
The sparse notation assigns a default value of 0 (zero) to attributes not
specified on a data line. We use binary values to indicate whether a base
station, access point or node is in range.
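
As an illustration of the sparse notation, the following Python 3 function
expands one data line into a dense list of values, filling in the default
of 0 for every attribute the line does not mention. This is a minimal
sketch and not the bundled arfftotxt.py script; the function name
parse_sparse_line and its num_attributes parameter are hypothetical. It
assumes that values contain no commas, as is the case for the date,
numeric and binary values described here.

```python
def parse_sparse_line(line, num_attributes):
    """Expand one sparse ARFF data line into a dense list of strings.

    Hypothetical helper for illustration; assumes no value contains a comma.
    """
    # Attributes not listed on the line take the sparse default value 0.
    values = ["0"] * num_attributes
    # Drop surrounding whitespace and the enclosing curly braces.
    body = line.strip().strip("{}")
    for pair in body.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Each entry has the form "<attribute index> <value>".
        index, value = pair.split(None, 1)
        values[int(index)] = value.strip()
    return values


# Usage: a file with seven attributes (indices 0-6).
print(parse_sparse_line("{0 2007-03-10T16:10:46, 2 80, 4 1}", 7))
# → ['2007-03-10T16:10:46', '0', '80', '0', '1', '0', '0']
```

The values are kept as strings so that the caller can decide how to
interpret the timestamp and the numeric battery level.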

Example .arff file:

  @RELATION visibility

  @ATTRIBUTE timestamp date                                       % attr 0
  @ATTRIBUTE after-reset {0, 1}                                   % attr 1
  @ATTRIBUTE batterylevel NUMERIC                                 % attr 2
  @ATTRIBUTE on-ac {0, 1}                                         % attr 3
  @ATTRIBUTE "cid-copname01-copnumber01-lac1398-cid82300" {0, 1}  % attr 4
  @ATTRIBUTE "cid-copname01-copnumber01-lac1398-cid23897" {0, 1}  % attr 5
  @ATTRIBUTE "wlap-inrange-ssid58034-bssid27992" {0, 1}           % attr 6

  @DATA
  {0 2007-03-10T16:08:58, 1 1, 2 81, 4 1, 6 1}
  {0 2007-03-10T16:10:46, 2 80, 4 1}
  {0 2007-03-10T16:12:06, 2 80, 5 1}

The timestamp, after-reset, batterylevel and on-ac attributes (numbers
0-3) are in every .arff file. The rest of the attributes refer to the
visibility of network resources, using the following naming scheme:

- cop-copname-copnumber            Cellular operator
- cid-copname-copnumber-lac-cid    Currently associated cell
- wlap-inrange-ssid-bssid          In-range 802.11 access point
- wlap-assoc-ssid-bssid            Currently associated 802.11 access point
- btnw-bdaddr                      In-range Bluetooth node

These attributes vary per participant.

When the after-reset attribute is set, the device was restarted between
the time of the previous line and the time of the current line.

For convenience, we have included a Python script to transform the .arff
files into .txt files in which the full names of visible network entities
are used (tested under Python version 2.5.1). Running

  > python arfftotxt.py */*.arff

from the directory of this README file will generate the .txt files.


LICENSE (adapted from Rice data license @ crawdad)

1. We grant you a nonexclusive, nontransferable license to use the data
   and/or code for commercial, educational, and/or research purposes
   only. You agree not to redistribute the data/code without our previous
   express written approval.

2. The traces we provide are anonymized. To respect the privacy of those
   human subjects whose activity is captured by the data, you will not
   attempt to reverse the anonymization process.
   This may include but is not limited to identifying specific cell ids,
   access points, node addresses, the actual users, or their location.

3. You agree to acknowledge the source of the data, i.e., the
   MobilityModels'08 paper or one of our more recent papers describing
   the data.

THIS DATA IS PROVIDED ON AN "AS IS" BASIS, WITHOUT ANY WARRANTY OR IMPLIED
FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL WE BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES