Tree 2.0.3

Editable interval tree data structure for Python 2 and 3


Project description

A mutable, self-balancing interval tree for Python 2 and 3. Queries may be by point, by range overlap, or by range envelopment.

This library was designed to allow tagging text and time intervals, where the intervals include the lower bound but not the upper bound.

Version 3 changes!

  • The search(begin, end, strict) method no longer exists. Instead, use one of these:
    • at(point)
    • overlap(begin, end)
    • envelop(begin, end)
  • The extend(items) method no longer exists. Instead, use update(items).
  • Methods like merge_overlaps() which took a strict argument consistently default to strict=True. Before, some methods defaulted to True and others to False.

Installing

Install the package from PyPI with pip: pip install intervaltree

Features

  • Supports Python 2.7 and Python 3.4+ (Tested under 2.7, and 3.4 thru 3.7)

  • Initializing

    • blank (tree = IntervalTree())
    • from an iterable of Interval objects (tree = IntervalTree(intervals))
    • from an iterable of tuples (tree = IntervalTree.from_tuples(interval_tuples))
  • Insertions

    • tree[begin:end] = data
    • tree.add(interval)
    • tree.addi(begin, end, data)
  • Deletions

    • tree.remove(interval) (raises ValueError if not present)
    • tree.discard(interval) (quiet if not present)
    • tree.removei(begin, end, data) (short for tree.remove(Interval(begin, end, data)))
    • tree.discardi(begin, end, data) (short for tree.discard(Interval(begin, end, data)))
    • tree.remove_overlap(point)
    • tree.remove_overlap(begin, end) (removes all overlapping the range)
    • tree.remove_envelop(begin, end) (removes all enveloped in the range)
  • Point queries

    • tree[point]
    • tree.at(point) (same as previous)
  • Overlap queries

    • tree[begin:end]
    • tree.overlap(begin, end) (same as previous)
  • Envelop queries

    • tree.envelop(begin, end)
  • Membership queries

    • interval_obj in tree (this is fastest, O(1))
    • tree.containsi(begin, end, data)
    • tree.overlaps(point)
    • tree.overlaps(begin, end)
  • Iterable

    • for interval_obj in tree:
    • tree.items()
  • Sizing

    • len(tree)
    • tree.is_empty()
    • not tree
    • tree.begin() (the begin coordinate of the leftmost interval)
    • tree.end() (the end coordinate of the rightmost interval)
  • Set-like operations

    • union

      • result_tree = tree.union(iterable)
      • result_tree = tree1 | tree2
      • tree.update(iterable)
      • tree |= other_tree
    • difference

      • result_tree = tree.difference(iterable)
      • result_tree = tree1 - tree2
      • tree.difference_update(iterable)
      • tree -= other_tree
    • intersection

      • result_tree = tree.intersection(iterable)
      • result_tree = tree1 & tree2
      • tree.intersection_update(iterable)
      • tree &= other_tree
    • symmetric difference

      • result_tree = tree.symmetric_difference(iterable)
      • result_tree = tree1 ^ tree2
      • tree.symmetric_difference_update(iterable)
      • tree ^= other_tree
    • comparison

      • tree1.issubset(tree2) or tree1 <= tree2
      • tree1.issuperset(tree2) or tree1 >= tree2
      • tree1 == tree2
  • Restructuring

    • chop(begin, end) (slice intervals and remove everything between begin and end, optionally modifying the data fields of the chopped-up intervals)
    • slice(point) (slice intervals at point)
    • split_overlaps() (slice at all interval boundaries, optionally modifying the data field)
    • merge_overlaps() (joins overlapping intervals into a single interval, optionally merging the data fields)
    • merge_equals() (joins intervals with matching ranges into a single interval, optionally merging the data fields)
  • Copying and typecasting

    • IntervalTree(tree) (Interval objects are same as those in tree)
    • tree.copy() (Interval objects are shallow copies of those in tree)
    • set(tree) (can later be fed into IntervalTree())
    • list(tree) (ditto)
  • Pickle-friendly

  • Automatic AVL balancing
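
The restructuring operations in the feature list are easiest to picture on plain (begin, end) pairs. The following is a minimal pure-Python sketch of what merging overlapping half-open intervals means. It is not the library's implementation: the function name merge_overlapping is hypothetical, and the behaviour of its strict flag (touching intervals stay separate when strict is true) is an assumption chosen to mirror the description of merge_overlaps().

```python
def merge_overlapping(intervals, strict=True):
    """Merge half-open (begin, end) pairs that overlap.

    Illustrative sketch only. With strict=True, intervals that merely touch
    (one ends exactly where the next begins) stay separate; with
    strict=False they are joined as well.
    """
    merged = []
    for begin, end in sorted(intervals):
        if merged:
            last_begin, last_end = merged[-1]
            overlaps = begin < last_end
            touches = begin == last_end
            if overlaps or (touches and not strict):
                # Extend the previous interval instead of adding a new one.
                merged[-1] = (last_begin, max(last_end, end))
                continue
        merged.append((begin, end))
    return merged


spans = [(1, 3), (2, 5), (5, 7), (10, 12)]
print(merge_overlapping(spans))                # [(1, 5), (5, 7), (10, 12)]
print(merge_overlapping(spans, strict=False))  # [(1, 7), (10, 12)]
```

Sorting first means each interval only has to be compared with the most recently merged one.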

Examples

  • Getting started

  • Adding intervals - any object works!

  • Query by point

    The result of a query is a set object, so if ordering is important, you must sort it first.

  • Query by range

    Note that ranges are inclusive of the lower limit, but non-inclusive of the upper limit. So:

    Since our search was over 2 ≤ x < 4, neither Interval(1, 2) nor Interval(4, 7) was included. The first interval, 1 ≤ x < 2, does not include x = 2. The second interval, 4 ≤ x < 7, does include x = 4, but our search interval excludes it. So, there were no overlapping intervals. However:

    To only return intervals that are completely enveloped by the search range:

  • Accessing an Interval object

  • Constructing from lists of intervals

    We could have made a similar tree this way:

    Or, if we don't need the data fields:

    Or even:

  • Removing intervals

    We could also empty a tree entirely:

    Or remove intervals that overlap a range:

    We can also remove only those intervals completely enveloped in a range:

  • Chopping

    We could also chop out parts of the tree:

    To modify the new intervals' data fields based on which side of the interval is being chopped:

  • Slicing

    You can also slice intervals in the tree without removing them:

    You can also set the data fields, for example, re-using datafunc() from above:
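
The half-open convention from the query examples (lower bound included, upper bound excluded) can be checked with a few plain predicates. This is a sketch of the semantics only, written without the library; the function names and the extra interval (5, 9) are made up for illustration.

```python
def contains_point(begin, end, point):
    """True if the half-open interval [begin, end) contains point."""
    return begin <= point < end


def overlaps(begin, end, q_begin, q_end):
    """True if [begin, end) shares at least one point with [q_begin, q_end)."""
    return begin < q_end and q_begin < end


def enveloped(begin, end, q_begin, q_end):
    """True if [begin, end) lies completely inside [q_begin, q_end)."""
    return q_begin <= begin and end <= q_end


intervals = [(1, 2), (4, 7), (5, 9)]

# Searching 2 <= x < 4 finds nothing: (1, 2) stops just short of 2,
# and (4, 7) starts exactly where the search range ends.
print([iv for iv in intervals if overlaps(*iv, 2, 4)])   # []

# Widening the search to 1 <= x < 5 picks up both neighbours.
print([iv for iv in intervals if overlaps(*iv, 1, 5)])   # [(1, 2), (4, 7)]

# Only (1, 2) lies completely inside 1 <= x < 5.
print([iv for iv in intervals if enveloped(*iv, 1, 5)])  # [(1, 2)]
```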

Future improvements

See the issue tracker on GitHub.

Based on

  • Eternally Confuzzled's AVL tree
  • Wikipedia's Interval Tree
  • Heavily modified from Tyler Kahn's Interval Tree implementation in Python (GitHub project)
  • Incorporates contributions from:
    • konstantint/Konstantin Tretyakov of the University of Tartu (Estonia)
    • lmcarril/Luis M. Carril of the Karlsruhe Institute for Technology (Germany)

Copyright

  • Chaim Leib Halbert, 2013-2018
  • Modifications, Konstantin Tretyakov, 2014

Licensed under the Apache License, version 2.0.

The source code for this project is at https://github.com/chaimleib/intervaltree

Release history

3.0.2

3.0.1

3.0.0

2.1.0

2.0.4

2.0.3

2.0.2

2.0.1

2.0.0

1.1.1

1.1.0

1.0.2

1.0.1

1.0.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for intervaltree, version 3.0.2
Filename, size: intervaltree-3.0.2.tar.gz (30.7 kB)
File type: Source
Python version: None

Hashes for intervaltree-3.0.2.tar.gz

Algorithm    Hash digest
SHA256       cb4f61c81dcb4fea6c09903f3599015a83c9bdad1f0bbd232495e6681e19e273
MD5          4159801f23d2d9eb98db6c60f9c3b665
BLAKE2-256   e8f976237755b2020cd74549e98667210b2dd54d3fb17c6f4a62631e61d31225

In computer science, the Bx-tree is a query- and update-efficient B+ tree-based index structure for moving objects.

Index structure

The base structure of the Bx-tree is a B+ tree in which the internal nodes serve as a directory, each containing a pointer to its right sibling. In the earlier version of the Bx-tree,[1] the leaf nodes contained the moving-object locations being indexed and corresponding index time. In the optimized version,[2] each leaf node entry contains the id, velocity, single-dimensional mapping value and the latest update time of the object. The fanout is increased by not storing the locations of moving objects, as these can be derived from the mapping values.

Utilizing the B+ tree for moving objects

An example of the Bx-tree with the number of index partitions equal to 2 within one maximum update interval tmu. In this example, at most three partitions exist at the same time. After linearization, object locations inserted at time 0 are indexed in partition 0 with label timestamp 0.5 tmu, object locations updated between time 0 and 0.5 tmu are indexed in partition 1 with label timestamp tmu, and so on (as indicated by arrows). As time elapses, the first range repeatedly expires (shaded area) and a new range is appended (dashed line).

As in many other moving-object indexes, a two-dimensional moving object is modeled as a linear function O = ((x, y), (vx, vy), t), where (x, y) and (vx, vy) are the location and velocity of the object at a given time instance t, i.e., the time of the last update. The B+ tree is a structure for indexing single-dimensional data. In order to adopt the B+ tree as a moving-object index, the Bx-tree uses a linearization technique that maps an object's location at time t to a single-dimensional value. Specifically, objects are first partitioned according to their update time. For objects within the same partition, the Bx-tree stores their locations as of a given time, estimated by linear interpolation. By doing so, the Bx-tree keeps a consistent view of all objects within the same partition without storing the update time of each object.
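
The linear motion model can be sketched in a few lines. The snippet below assumes the simplest reading of the text, position plus velocity times elapsed time; the MovingObject type, the position_at helper and the sample values are hypothetical and not taken from the cited papers.

```python
from typing import NamedTuple, Tuple


class MovingObject(NamedTuple):
    position: Tuple[float, float]   # (x, y) at the time of last update
    velocity: Tuple[float, float]   # (vx, vy)
    updated_at: float               # t, the time of last update


def position_at(obj: MovingObject, t: float) -> Tuple[float, float]:
    """Linearly estimate the object's position at time t."""
    dt = t - obj.updated_at
    (x, y), (vx, vy) = obj.position, obj.velocity
    return (x + vx * dt, y + vy * dt)


# Two objects updated at different times, viewed at one common timestamp.
a = MovingObject((0.0, 0.0), (1.0, 2.0), updated_at=5.0)
b = MovingObject((10.0, 4.0), (-2.0, 0.5), updated_at=7.0)
common_t = 9.0  # hypothetical label timestamp shared by the partition
print(position_at(a, common_t))  # (4.0, 8.0)
print(position_at(b, common_t))  # (6.0, 5.0)
```

Because both positions refer to the same timestamp, they can be compared directly without keeping the individual update times.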

Secondly, the space is partitioned by a grid and the location of an object is linearized within the partitions according to a space-filling curve, e.g., the Peano or Hilbert curves.

Finally, with the combination of the partition number (time information) and the linear order (location information), an object is indexed in the Bx-tree with a one-dimensional index key Bxvalue:

$$B^{x}\text{value}(O, t) = [\text{indexpartition}]_{2} + [\text{xrep}]_{2}$$

Here indexpartition is an index partition determined by the update time and xrep is the space-filling curve value of the object position at the indexed time, $[X]_{2}$ denotes the binary value of X, and "+" means concatenation.

Given an object O ((7, 2), (-0.1,0.05), 10), tmu = 120, the Bxvalue for O can be computed as follows.

  1. O is indexed in partition 0 as mentioned. Therefore, indexpartition = (00)_2.
  2. O's position at the label timestamp of partition 0 is (1, 5).
  3. Using a Z-curve with order = 3, the Z-value of O, i.e., xrep, is (010011)_2.
  4. Concatenating indexpartition and xrep gives Bxvalue = (00010011)_2 = 19.
  5. Example O ((0,6), (0.2, -0.3 ),10) and tmu=120 then O's position at the label timestamp of partition: ???
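
The key computation in steps 3 and 4 can be reproduced with a short sketch. The bit order (the x bit placed before the y bit in each pair) is inferred from the worked example rather than stated in the text, and the function names z_value and bx_value are hypothetical.

```python
def z_value(x: int, y: int, order: int) -> int:
    """Interleave the low `order` bits of x and y, x bit first in each pair."""
    z = 0
    for bit in range(order - 1, -1, -1):
        z = (z << 1) | ((x >> bit) & 1)
        z = (z << 1) | ((y >> bit) & 1)
    return z


def bx_value(partition: int, x: int, y: int, order: int) -> int:
    """Concatenate the partition number with the Z-value of (x, y).

    The partition number occupies the bits above the 2 * order Z-value bits.
    """
    return (partition << (2 * order)) | z_value(x, y, order)


print(format(z_value(1, 5, 3), "06b"))  # 010011
print(bx_value(0, 1, 5, 3))             # 19
```

With partition 0 the leading bits are all zero, which is why the key equals the Z-value itself in the example.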

Insertion, update and deletion

Given a new object, its index key is computed and then the object is inserted into the Bx-tree as in the B+ tree. An update consists of a deletion followed by an insertion. An auxiliary structure is employed to keep the latest key of each index so that an object can be deleted by searching for the key. The indexing key is computed before affecting the tree. In this way, the Bx-tree directly inherits the good properties of the B+ tree, and achieves efficient update performance.
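
A minimal sketch of this bookkeeping follows. A sorted Python list stands in for the B+ tree and a dictionary plays the role of the auxiliary structure; the class name KeyedIndex, its methods and the sample keys are hypothetical, and a real implementation would use a disk-based B+ tree.

```python
import bisect


class KeyedIndex:
    """Sorted list of (key, object_id) pairs plus an id -> key lookup table."""

    def __init__(self):
        self._entries = []      # sorted (key, object_id) pairs
        self._latest_key = {}   # object_id -> key currently in the index

    def insert(self, object_id, key):
        bisect.insort(self._entries, (key, object_id))
        self._latest_key[object_id] = key

    def delete(self, object_id):
        # The auxiliary table tells us which key to search for.
        key = self._latest_key.pop(object_id)
        pos = bisect.bisect_left(self._entries, (key, object_id))
        del self._entries[pos]

    def update(self, object_id, new_key):
        # An update is a deletion followed by an insertion.
        if object_id in self._latest_key:
            self.delete(object_id)
        self.insert(object_id, new_key)

    def ids_in_key_range(self, low, high):
        """Return object ids whose key k satisfies low <= k <= high."""
        start = bisect.bisect_left(self._entries, (low,))
        result = []
        for key, object_id in self._entries[start:]:
            if key > high:
                break
            result.append(object_id)
        return result


index = KeyedIndex()
index.insert("car-1", 19)
index.insert("car-2", 83)
index.update("car-1", 70)
print(index.ids_in_key_range(64, 127))  # ['car-1', 'car-2']
```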

Queries

Range query

A range query retrieves all objects whose location falls within the rectangular range $q = ([qx_1, qy_1]; [qx_2, qy_2])$ at a time $t_q$ not prior to the current time.

The Bx-tree uses a query-window enlargement technique to answer queries. Since the Bx-tree stores an object's location as of some time after its update time, the enlargement involves two cases: a location must either be brought back to an earlier time or forward to a later time. The main idea is to enlarge the query window so that it encloses all objects whose positions are not within the query window at the label timestamp but will enter the query window at the query timestamp.

After the enlargement, the partitions of the Bx-tree need to be traversed to find objects falling in the enlarged query window. In each partition, the use of a space-filling curve means that a range query in the native, two-dimensional space becomes a set of range queries in the transformed, one-dimensional space.[1]
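
One plausible way to compute such an enlarged window is sketched below. It assumes that global per-axis minimum and maximum velocities are known for all indexed objects; the cited papers may derive tighter bounds. The function name enlarge_window and the sample numbers are hypothetical.

```python
def enlarge_window(window, t_query, t_label, v_min, v_max):
    """Enlarge a query rectangle from the query time to a label timestamp.

    window is ((x1, y1), (x2, y2)); v_min and v_max are per-axis velocity
    bounds (vx, vy) assumed to hold for every indexed object.
    """
    (x1, y1), (x2, y2) = window
    dt = t_query - t_label

    def axis(lo, hi, vmin, vmax):
        # An object stored at p reaches p + v * dt at query time, so it lands
        # in [lo, hi] when p lies in [lo - v * dt, hi - v * dt] for its v.
        shifts = (vmin * dt, vmax * dt)
        return lo - max(shifts), hi - min(shifts)

    ex1, ex2 = axis(x1, x2, v_min[0], v_max[0])
    ey1, ey2 = axis(y1, y2, v_min[1], v_max[1])
    return (ex1, ey1), (ex2, ey2)


# Query window at time 5, positions stored as of label timestamp 0.
enlarged = enlarge_window(((10, 10), (20, 20)), 5, 0, (-1, -2), (2, 1))
print(enlarged)  # ((0, 5), (25, 30))
```

Taking the maximum and minimum of the two shifts handles both cases from the text, since the sign of dt decides whether positions are moved forward or back. Candidates found in the enlarged window still need an exact check against the original window at the query time.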

To avoid an excessively large query region after expansion in skewed datasets, an optimization of the query algorithm exists[3] that improves query efficiency by avoiding unnecessary query enlargement.

K nearest neighbor query

A k nearest neighbor query is computed by iteratively performing range queries with an incrementally enlarged search region until k answers are obtained. Another possibility is to employ querying ideas similar to those of the iDistance technique.
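
The iterative scheme can be sketched as follows. A brute-force scan over a list stands in for the index range query, the search region is assumed to be circular, and the starting radius and growth factor are arbitrary illustrative choices.

```python
import math


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def range_query(points, center, radius):
    """Brute-force stand-in for an index range query (circular region)."""
    return [p for p in points if distance(p, center) <= radius]


def k_nearest(points, center, k, initial_radius=1.0, growth=2.0):
    """Enlarge the search radius until at least k points are found."""
    if k > len(points):
        raise ValueError("k exceeds the number of indexed points")
    radius = initial_radius
    while True:
        found = range_query(points, center, radius)
        if len(found) >= k:
            # Every point outside the circle is farther away than any point
            # inside it, so the k closest of `found` are the true answers.
            found.sort(key=lambda p: distance(p, center))
            return found[:k]
        radius *= growth


points = [(0, 0), (3, 4), (1, 1), (10, 10)]
print(k_nearest(points, (0, 0), 2))  # [(0, 0), (1, 1)]
```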

Other queries

The range query and K Nearest Neighbor query algorithms can be easily extended to support interval queries, continuous queries, etc.[2]

Adapting relational database engines to accommodate moving objects

Since the Bx-tree is an index built on top of a B+ tree index, all operations in the Bx-tree, including the insertion, deletion and search, are the same as those in the B+ tree. There is no need to change the implementations of these operations. The only difference is to implement the procedure of deriving the indexing key as a stored procedure in an existing DBMS. Therefore, the Bx-tree can be easily integrated into existing DBMS without touching the kernel.

SpADE[4] is a moving-object management system built on top of the popular relational database system MySQL, which uses the Bx-tree for indexing the objects. In the implementation, moving-object data is transformed and stored directly in MySQL, and queries are transformed into standard SQL statements which are efficiently processed in the relational engine. Most importantly, all of this is achieved neatly and independently, without modifying the MySQL core.

Performance tuning

Potential problem with data skew

The Bx-tree uses a grid for space partitioning while mapping a two-dimensional location to a one-dimensional key. With skewed data, this can degrade the performance of both queries and updates. If the grid cells are too large, many objects fall into a single cell. Since objects in a cell are indistinguishable to the index, the underlying B+ tree will contain overflow nodes. Overflow pages not only unbalance the tree but also increase the update cost. For queries, large cells produce more false positives for a given query region and so increase the processing time. On the other hand, if the space is partitioned with a finer grid, i.e., smaller cells, each cell contains few objects. There are hardly any overflow pages, so the update cost is minimized, and a query retrieves fewer false positives. However, more cells need to be searched, which increases the workload of a query.
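
The trade-off can be made concrete by counting objects per cell for two grid resolutions. The data set, the cell sizes and the helper cell_occupancy below are made up for illustration.

```python
from collections import Counter


def cell_occupancy(points, cell_size):
    """Count how many points fall into each grid cell of the given size."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


# A skewed data set: a tight cluster plus two outliers (made-up values).
points = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (40, 40), (90, 10)]

coarse = cell_occupancy(points, cell_size=50)
fine = cell_occupancy(points, cell_size=2)

# Largest number of objects in one cell, and number of occupied cells.
print(max(coarse.values()), len(coarse))  # 6 2
print(max(fine.values()), len(fine))      # 2 6
```

The coarse grid packs six objects into one cell, while the fine grid spreads the same objects over six cells that a query may have to visit.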

Index tuning

The ST2B-tree [5] introduces a self-tuning framework for tuning the performance of the Bx-tree while dealing with data skew in space and data change with time. In order to deal with data skew in space, the ST2B-tree splits the entire space into regions of different object density using a set of reference points. Each region uses an individual grid whose cell size is determined by the object density inside of it.

The Bx-tree has multiple partitions covering different time intervals. As time elapses, each partition grows and shrinks alternately. The ST2B-tree uses this feature to tune the index online, adjusting the space partitioning to accommodate changes in the data over time. In particular, when a partition shrinks to empty and starts growing again, it chooses a new set of reference points and a new grid for each reference point according to the latest data density. The tuning is based on the latest statistics collected during a given period of time, so that the space partitioning should fit the latest data distribution best. In this way, the ST2B-tree is expected to minimize the effect of data skew in space and of data changes over time.

References

  1. Christian S. Jensen, Dan Lin, and Beng Chin Ooi. Query and Update Efficient B+tree based Indexing of Moving Objects. In Proceedings of the 30th International Conference on Very Large Data Bases (VLDB), pages 768-779, 2004.
  2. Dan Lin. Indexing and Querying Moving Objects Databases. PhD thesis, National University of Singapore, 2006.
  3. C. S. Jensen, D. Tiesyte, and N. Tradisauskas. Robust B+-Tree-Based Indexing of Moving Objects. In Proceedings of the Seventh International Conference on Mobile Data Management, Nara, Japan, 9 pages, May 9–12, 2006.
  4. SpADE: A SPatio-temporal Autonomic Database Engine for location-aware services. Archived 2009-01-02 at the Wayback Machine.
  5. Su Chen, Beng Chin Ooi, Kian-Lee Tan, and Mario A. Nascimento. ST2B-tree: A Self-Tunable Spatio-Temporal B+-tree for Moving Objects. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 29-42, 2008. Archived 2011-06-11 at the Wayback Machine.
