Projects

yefan.projs.html

Sensor Network Research

Data Forwarding:

Gradient Broadcast (GRAB) in sensor networks

Abstract:

GRAB addresses the reliable deliver of sensing data through a vast field of small, vulnerable sensors which may fail frequently and communicate via lossy wireless links. We propose a new set of mechanisms and protocols which is designed specifically for robust data delivery for large scale sensor networks in a harsh or even hostile environment. GRAB first builds and maintains a cost field, providing each sensor the direction to forward sensing data. It then forwards data along a band of interleaved mesh from each source to the receiver. GRAB controls the width of the band by the amount of credit carried in each data message, allowing the sender to adjust the robustness of data delivery.  The design harnesses the advantage of large scale and relies on the collective efforts of multiple nodes to deliver data, without dependency on any individual ones. We have evaluated the GRAB performance through both analysis and extensive simulation. Our analysis shows quantitatively the advantage of interleaved mesh over multiple parallel paths. Our simulation further confirms the analysis results and shows that GRAB can successfully deliver over 90% of packets over 30 hops with relatively low energy cost,  even under the adverse conditions of 30% node failures compounded with 15% link message losses.
 

Idea:

- Build a cost field for the sink. Each node keeps a cost of sending a packet to the sink via some path (ideally the minimum cost path). All costs of the nodes form the cost field. If we imagine that each node is elevated to a height equivalent to its cost, the whole cost field would look like a funnel: the sink sits at the bottom, while nodes farther to the sink are "higher".

- Each packet carries a "credit", which is some "extra" budget that can be used to deliver a packet over a path. The total cost over the path must be smaller or equal to the total budget. Thus by controlling the amount of credit a packet carries, we effectively control the redundancy in the forwarding paths. These paths interleave and form a forwarding "mesh". The credit is the means to trade off between robustness and total cost.

Publications:

GRAB's receiver based forwarding is a precursor of opportunistic routing. This work has been cited 127 times (all citations from http://scholar.google.com/). The previous webpage for GRAB is here.
 

Two-tier Data Dissemination (TTDD)

Abstract:

Sink mobility brings new challenges to large-scale sensor networking. It suggests that information about each mobile sink's location be continuously propagated through the sensor field to keep all sensor nodes updated with the direction of forwarding future data reports. Unfortunately frequent location updates from multiple sinks can lead to both excessive drain of sensors' limited battery power supply and increased collisions in wireless transmissions.  We propose TTDD, a Two-Tier Data Dissemination approach that provides scalable and efficient data delivery to multiple mobile sinks. Each data source in TTDD proactively builds a grid structure which enables mobile sinks to continuously receive data on the move by flooding queries within a local cell only. TTDD's design exploits the fact that sensor nodes are stationary and location-aware to construct and maintain the grid structures with low overhead. We have evaluated TTDD performance through both analysis and extensive simulation experiments. Our results show that TTDD handles multiple mobile sinks efficiently with performance comparable with that of stationary sinks.
 

Idea:

- Each source builds a virtual square grid in the network and recruits a set of nodes at grid crossings as its dissemination nodes.  A mobile sink locally floods a query within a range of a grid cell size to location nearby dissemination nodes.

- Queries are forwarded along the grid towards the source. Data flow back in reverse direction.  Thus the impact of sinks' mobility is confined to a local area of about a cell size.

Publications

This work has been cited over 300 times.


Energy Efficiency

Probing Environment and Adaptive Sleeping (PEAS)

Abstract:

PEAS is a randomized energy-conservation protocol that seeks to build resilient sensor networks in the presence of frequent, unexpected node failures.  PEAS extends the network lifetime by maintaining a necessary set of working nodes and turning off redundant ones, which wake up after randomized sleeping times and replace failed ones when needed. Different from the common practice of maintaining local neighborhood topology at each node to achieve distributed coordination, PEAS has only constant state and does not require per neighbor state at any node. Each node operates based on one single piece of information. This allows PEAS to scale to very dense node deployment. PEAS is highly robust against node failures due to its simple operations and randomized design. It also ensures asymptotic connectivity.  Our simulations and analysis show that PEAS can maintain an adequate working node density in presence of as high as 38% node failures, and a roughly constant overhead of less than 1% of the total energy consumption under various deployment densities. As a result, PEAS extends a sensor network's functioning time in linear proportional to the deployed sensor population.
 

Idea:

- There's only one simple rule about which nodes can work: the distance between two neighboring working nodes should be at least Rp, which is a system parameter that determines the density of working nodes. Rp should be less than the maximum transmission range Rt to ensure connectivity of working nodes.

- After deployment, each node first sleeps for an exponentially distributed time. When a node wakes up, it broadcasts a PROBE message within Rp. If any working nodes is within Rp, it sends back a REPLY (also within Rp). Upon hearing a REPLY, the probing node sleeps again, for another dynamically adjusted period of time; If it doesn't hear any REPLY, it starts working, until it uses up energy or fails.

Publications

This work has been cited over 300 times.

Security:

My research on sensor network security provides three progressively more powerful solutions for handling injected false data. SEF is the first work that addresses the problem. It has a threshold weakness, solved by the second work, LBRS. These two are passive in that they still allow the attacker to continue the attack. The third work (PNM) is a step further to provideactive fightback mechanism.

Statistical En-route Filtering of Injected False Data (SEF)

Abstract:

In a large-scale sensor network individual sensors are subject to security compromises. A compromised node can inject bogus sensing reports into the network. If undetected, these bogus reports would be forwarded to the data collection point (i.e. the sink). Such attacks by compromised sensors can cause not only false alarms but also the depletion of the finite amount of energy in a battery powered network. In this paper we present a Statistical En-route Filtering (SEF) mechanism that can detect and drop such false reports. SEF requires that each sensing report be validated by multiple keyed message authentication codes (MACs), each generated by a node that detects the same event. As the report is forwarded, each node along the way verifies the correctness of the MACs probabilistically and drops those with invalid MACs at earliest points. The sink further filters out remaining false reports that escape the en-route filtering. SEF exploits the network scale to determine the truthfulness of each report through collective decision-making by multiples detecting nodes and collective false-report-detection by multiple forwarding nodes. Our analysis and simulations show that, with an overhead of 14 bytes per report, SEF is able to drop 80~90% falsely injected reports by a compromised node within 10 forwarding hops.
 

Idea:

- Divide a gloal key pool into n non-overlapping partitions.Each node pick k keys randomly from one of the n partitions. Thus only two nodes picking keys from the same partition has certain probability of sharing keys.
- Multiple detecting nodes each generates a keyed Message Authentication Code (MAC) and sends to an elected CoS (Center of Stimulus) node. CoS attaches the MACs to the final report - Each forwarding node has probability of possessing one of the keys used in the attached MACs, thus verifying the correctness and dropping those reports forged by attackers

Publications

This work has been cited over 120 times.
       

Toward Resilient Security in Wireless Sensor Networks

Abstract:

Node compromise poses severe security threats in wireless sensor networks. Unfortunately, existing security designs can address only a small, fixed threshold number of compromised nodes; the security protection completely breaks down when the threshold is exceeded. In this paper, we seek to overcome the threshold limitation and achieve resiliency against an increasing number of compromised nodes. To this end, we propose a novel location-based approach in which the secret keys are bound to geographic locations, and each node stores a few keys based on its own location. The location-binding property constrains the scope for which individual keys can be (mis)used, thus limiting the damages caused by a collection of compromised nodes. We illustrate this approach through the problem of report fabrication attacks, in which the compromised nodes forge non-existent events. We evaluate our design through extensive analysis, implementation and simulations, and demonstrate its graceful performance degradation in the presence of an increasing number of compromised nodes.

Publications

This work has been cited over 30 times.

Catching "Moles" in  Sensor Networks

Abstract:

False data injection is a severe attack that compromised sensor nodes (``moles,'' spies who operate from within an organization, especially agents operating against their own governments. We use it to refer to compromised sensor nodes.}) can launch. These moles inject large amount of bogus traffic that can lead to application failures and exhausted network resources. Existing sensor network security proposals only passively mitigate the damage by filtering injected packets; they do not provide active means for fight back. This paper studies how to locate such moles within the framework of packet marking, when forwarding moles collude with source moles to manipulate the marks. Existing Internet traceback mechanisms do not assume compromised forwarding nodes and are easily defeated by manipulated marks. We propose a Probabilistic Nested Marking (PNM) scheme that is secure against such colluding attacks. No matter how colluding moles manipulate the marks, PNM can always locate them one by one.  We prove that nested marking is both sufficient and necessary to resist colluding attacks. PNM also has fast-traceback: within about 50 packets, it can track down a mole up to 20 hops away from the sink. This virtually prevents any effective data injection attack: moles will be caught before they have injected any meaningful amount
of bogus traffic.

Publications

  • Fan Ye, Hao Yang, Zhen Liu, ``Catching 'Moles' in Sensor Networks'', IEEE International Conference on Distributed Computing Systems (ICDCS),  Toronto, Canada, June 2007, (acceptance ratio 13.4% 71/528)



Stream Processing Systems Research

CLASP: CoLlaborating, Autonomous Stream Processing Systems

Abstract:

There are currently a number of streaming data analysis systems in research or commercial operation. These systems are generally large-scale distributed systems, but each system operates in isolation, under the control of one administrative authority. We are developing middleware that permits autonomous or semi-autonomous streaming analysis systems (called ``sites'') to interoperate, providing them opportunities for data access, performance improvements, and reliability far exceeding that available in a single system.  Unique characteristics of our system include an architecture for the management of multiple cooperation paradigms depending on the degree of trust and  dependencies among the participating sites; a multisite planner that converts user-specified declarative queries into specifications of distributed jobs; and a mechanism for automatic recovery of site failures by redispatching failed pieces of a distributed job.  We evaluate our architecture via experiments on a running prototype, and the results demonstrate the advantages of multi-site cooperation: collaborative jobs that share resources, even across only a few sites, can produce results 50% faster than independent execution, and jobs on failed sites can be recovered within a few seconds.

Publications

  • Michael Branson, Fred Douglis, Brad Fawcett, Zhen Liu, Anton Riabov, Fan Ye, ``*CLASP: Collaborating, Autonomous Stream Processing Systems'', in ACM/USENIX International Middleware Conference, Newport Beach, California, November 2007 (acceptance ratio 20.6\% 22/107)
    *  Authors in alphabetical order for most IBM publications.

Heuristics for Negotiation Schedules in Multi-plan Optimization

Abstract:

In cooperating systems such as grids and collaborative streaming analysis, autonomous sites can establish ``agreements'' to arrange access to remote resources for a period of time.  The determination of which resources to reserve to accomplish a task need not be known a priori, because there exist multiple plans for accomplishing the same task and they may require access to different resources.   While these plans can be functionally equivalent, they may have different performance/cost tradeoffs and may use a variety of resources, both local and belonging to other sites.  The negotiation schedule, i.e., the order in which remote resources are negotiated, determines how quickly one plan can be selected and deployed; it also decides the utility for running the plan. This paper studies the problem of optimizing negotiation schedules in cooperative systems with multiple alternative plans. We first provide a voting-based heuristic that reduces the complexity $O(n!)$ of the exhaustive search to $O(m n^q)$. We also present a weight-based heuristic that further reduces the complexity to $O(mn)$. Experimental results show that, on average, 1) the voting-based approach achieved  6% higher utility than the weight-based approach,  2) the two proposed approaches achieved almost 50% higher utility than a randomized approach; and 3) the two proposed approaches produced 10% lower utility than that of the optimal schedule with reasonable plan sizes.

Publications

Resource Discovery in Federated Systems with Voluntary Sharing

Abstract:

Federated system is a popular paradigm for multiple organizations to collaborate and share resources for common benefits. In practice, however, these organizations are often within different administrative domains and thus demand autonomous resource management, as opposed to blindly exporting their resources for efficient search. To address these challenges, we present a new resource discovery middleware, called ROADS, that can facilitate voluntary sharing. In ROADS, the participants can associate with each other at their own will and dictate the extent of sharing by properly exporting summaries, a condensed representation of their resource records. To enable efficient search, these summaries are aggregated and replicated along an overlay-assisted server hierarchy, and the queries are routed to those servers that are likely to hold the desired resources. Our experimental results show that ROADS provides not only flexible resource sharing for federated systems but also efficient resource discovery, with performance comparable to a centrally managed system.

Publications: