Editable interval tree data structure for Python 2 and 3
Project description
A mutable, self-balancing interval tree for Python 2 and 3. Queries may be by point, by range overlap, or by range envelopment.
This library was designed to allow tagging text and time intervals, where the intervals include the lower bound but not the upper bound.
Version 3 changes!
- The search(begin, end, strict) method no longer exists. Instead, use one of these:
  - at(point)
  - overlap(begin, end)
  - envelop(begin, end)
- The extend(items) method no longer exists. Instead, use update(items).
- Methods like merge_overlaps() which took a strict argument consistently default to strict=True. Before, some methods defaulted to True and others to False.
Installing
Features
Supports Python 2.7 and Python 3.4+ (Tested under 2.7, and 3.4 thru 3.7)
Initializing
- blank: tree = IntervalTree()
- from an iterable of Interval objects: tree = IntervalTree(intervals)
- from an iterable of tuples: tree = IntervalTree.from_tuples(interval_tuples)
Insertions
tree[begin:end] = data
tree.add(interval)
tree.addi(begin, end, data)
Deletions
tree.remove(interval) (raises ValueError if not present)
tree.discard(interval) (quiet if not present)
tree.removei(begin, end, data) (short for tree.remove(Interval(begin, end, data)))
tree.discardi(begin, end, data) (short for tree.discard(Interval(begin, end, data)))
tree.remove_overlap(point)
tree.remove_overlap(begin, end) (removes all overlapping the range)
tree.remove_envelop(begin, end) (removes all enveloped in the range)
Point queries
tree[point]
tree.at(point) (same as previous)
Overlap queries
tree[begin:end]
tree.overlap(begin, end) (same as previous)
Envelop queries
tree.envelop(begin, end)
Membership queries
interval_obj in tree (this is fastest, O(1))
tree.containsi(begin, end, data)
tree.overlaps(point)
tree.overlaps(begin, end)
Iterable
for interval_obj in tree:
tree.items()
Sizing
len(tree)
tree.is_empty()
not tree
tree.begin() (the begin coordinate of the leftmost interval)
tree.end() (the end coordinate of the rightmost interval)
Set-like operations
union
result_tree = tree.union(iterable)
result_tree = tree1 | tree2
tree.update(iterable)
tree |= other_tree
difference
result_tree = tree.difference(iterable)
result_tree = tree1 - tree2
tree.difference_update(iterable)
tree -= other_tree
intersection
result_tree = tree.intersection(iterable)
result_tree = tree1 & tree2
tree.intersection_update(iterable)
tree &= other_tree
symmetric difference
result_tree = tree.symmetric_difference(iterable)
result_tree = tree1 ^ tree2
tree.symmetric_difference_update(iterable)
tree ^= other_tree
comparison
tree1.issubset(tree2) or tree1 <= tree2
tree1.issuperset(tree2) or tree1 >= tree2
tree1 == tree2
Restructuring
chop(begin, end) (slice intervals and remove everything between begin and end, optionally modifying the data fields of the chopped-up intervals)
slice(point) (slice intervals at point)
split_overlaps() (slice at all interval boundaries, optionally modifying the data field)
merge_overlaps() (joins overlapping intervals into a single interval, optionally merging the data fields)
merge_equals() (joins intervals with matching ranges into a single interval, optionally merging the data fields)
Copying and typecasting
IntervalTree(tree) (Interval objects are same as those in tree)
tree.copy() (Interval objects are shallow copies of those in tree)
set(tree) (can later be fed into IntervalTree())
list(tree) (ditto)
Pickle-friendly
Automatic AVL balancing
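To illustrate what merge_overlaps() from the Restructuring list does, here is a minimal plain-Python sketch. It does not use the library: Interval is a local namedtuple stand-in, data fields are simply discarded, and the reading of strict (touching intervals stay separate when strict=True) is an assumption rather than something stated above.

```python
from collections import namedtuple

# Local stand-in for the library's Interval; half-open: begin <= x < end.
Interval = namedtuple("Interval", ["begin", "end", "data"])


def merge_overlaps(intervals, strict=True):
    """Join overlapping intervals; data fields are dropped in this sketch.

    Assumption: with strict=True, intervals that only touch
    (one ends exactly where the next begins) are left unmerged.
    """
    merged = []
    for iv in sorted(intervals, key=lambda iv: (iv.begin, iv.end)):
        if merged:
            last = merged[-1]
            touches = iv.begin == last.end
            if iv.begin < last.end or (touches and not strict):
                merged[-1] = Interval(last.begin, max(last.end, iv.end), None)
                continue
        merged.append(Interval(iv.begin, iv.end, None))
    return merged


ivs = [Interval(1, 3, "a"), Interval(2, 5, "b"),
       Interval(5, 8, "c"), Interval(10, 12, "d")]
print(merge_overlaps(ivs))                # (1, 5), (5, 8), (10, 12)
print(merge_overlaps(ivs, strict=False))  # (1, 8), (10, 12)
```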
Examples
Getting started
Adding intervals - any object works!
Query by point
The result of a query is a set object, so if ordering is important, you must sort it first.
Query by range
Note that ranges are inclusive of the lower limit, but non-inclusive of the upper limit. So:
Since our search was over 2 ≤ x < 4, neither Interval(1, 2) nor Interval(4, 7) was included. The first interval, 1 ≤ x < 2, does not include x = 2. The second interval, 4 ≤ x < 7, does include x = 4, but our search interval excludes it. So, there were no overlapping intervals. However:
To only return intervals that are completely enveloped by the search range:
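The half-open rules described here can be checked with a small plain-Python sketch. It is not the library's implementation: Interval is a local namedtuple stand-in, the functions at, overlap and envelop only mirror the method names, and the third interval (5, 9) is an assumed extra added for illustration.

```python
from collections import namedtuple

# Local stand-in for the library's Interval; half-open: begin <= x < end.
Interval = namedtuple("Interval", ["begin", "end", "data"])


def at(intervals, point):
    """Intervals that contain the point."""
    return {iv for iv in intervals if iv.begin <= point < iv.end}


def overlap(intervals, begin, end):
    """Intervals sharing at least one point with [begin, end)."""
    return {iv for iv in intervals if iv.begin < end and begin < iv.end}


def envelop(intervals, begin, end):
    """Intervals lying entirely inside [begin, end)."""
    return {iv for iv in intervals if begin <= iv.begin and iv.end <= end}


ivs = {Interval(1, 2, "1-2"), Interval(4, 7, "4-7"), Interval(5, 9, "5-9")}

print(sorted(overlap(ivs, 2, 4)))  # empty: 2 and 4 are both excluded
print(sorted(overlap(ivs, 1, 5)))  # Interval(1, 2) and Interval(4, 7)
print(sorted(envelop(ivs, 1, 5)))  # only Interval(1, 2)
print(sorted(at(ivs, 6)))          # Interval(4, 7) and Interval(5, 9)
```

Because the results are sets, the sketch sorts them before printing.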
Accessing an Interval object
Constructing from lists of intervals
We could have made a similar tree this way:
Or, if we don't need the data fields:
Or even:
Removing intervals
We could also empty a tree entirely:
Or remove intervals that overlap a range:
We can also remove only those intervals completely enveloped in a range:
Chopping
We could also chop out parts of the tree:
To modify the new intervals' data fields based on which side of the interval is being chopped:
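A rough idea of what chopping does, written in plain Python without the library. Interval is a local namedtuple stand-in, and both the chop function and the datafunc(iv, islower) callback shown here are illustrative assumptions about the behaviour described above, not the library's code.

```python
from collections import namedtuple

# Local stand-in for the library's Interval; half-open: begin <= x < end.
Interval = namedtuple("Interval", ["begin", "end", "data"])


def chop(intervals, begin, end, datafunc=None):
    """Remove [begin, end) from every interval, keeping the parts outside it."""
    result = set()
    for iv in intervals:
        if iv.end <= begin or end <= iv.begin:
            result.add(iv)  # does not touch the chopped range
            continue
        if iv.begin < begin:  # a lower remnant survives
            data = datafunc(iv, True) if datafunc else iv.data
            result.add(Interval(iv.begin, begin, data))
        if end < iv.end:  # an upper remnant survives
            data = datafunc(iv, False) if datafunc else iv.data
            result.add(Interval(end, iv.end, data))
    return result


def datafunc(iv, islower):
    """Hypothetical callback: tag each remnant with the side it came from."""
    return "{} ({})".format(iv.data, "lower" if islower else "upper")


ivs = {Interval(0, 10, "a"), Interval(3, 7, "b"), Interval(12, 15, "c")}
print(sorted(chop(ivs, 3, 7)))
print(sorted(chop(ivs, 3, 7, datafunc)))
```

Interval(3, 7) lies entirely inside the chopped range and disappears; Interval(12, 15) is untouched.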
Slicing
You can also slice intervals in the tree without removing them:
You can also set the data fields, for example, re-using datafunc() from above:
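Slicing can be sketched the same way in plain Python. This is an illustration under assumptions, not the library's code: Interval is a local namedtuple stand-in, slice_at is a hypothetical helper name, and the datafunc(iv, islower) callback is defined here only for the example.

```python
from collections import namedtuple

# Local stand-in for the library's Interval; half-open: begin <= x < end.
Interval = namedtuple("Interval", ["begin", "end", "data"])


def slice_at(intervals, point, datafunc=None):
    """Split every interval that strictly contains point into two pieces."""
    result = set()
    for iv in intervals:
        if iv.begin < point < iv.end:
            lower = datafunc(iv, True) if datafunc else iv.data
            upper = datafunc(iv, False) if datafunc else iv.data
            result.add(Interval(iv.begin, point, lower))
            result.add(Interval(point, iv.end, upper))
        else:
            result.add(iv)  # point is outside or on a boundary
    return result


def datafunc(iv, islower):
    """Hypothetical callback: tag each piece with the side it came from."""
    return "{} ({})".format(iv.data, "lower" if islower else "upper")


ivs = {Interval(0, 10, "a"), Interval(5, 15, "b")}
print(sorted(slice_at(ivs, 3)))
print(sorted(slice_at(ivs, 3, datafunc)))
```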
Future improvements
See the issue tracker on GitHub.
Based on
- Eternally Confuzzled's AVL tree
- Wikipedia's Interval Tree
- Heavily modified from Tyler Kahn's Interval Tree implementation in Python (GitHub project)
- Incorporates contributions from:
- konstantint/Konstantin Tretyakov of the University of Tartu (Estonia)
- lmcarril/Luis M. Carril of the Karlsruhe Institute for Technology (Germany)
Copyright
- Chaim Leib Halbert, 2013-2018
- Modifications, Konstantin Tretyakov, 2014
Licensed under the Apache License, version 2.0.
The source code for this project is at https://github.com/chaimleib/intervaltree
Release history
3.0.2
3.0.1
3.0.0
2.1.0
2.0.4
2.0.3
2.0.2
2.0.1
2.0.0
1.1.1
1.1.0
1.0.2
1.0.1
1.0.0
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Filename, size | File type | Python version | Upload date | Hashes |
---|---|---|---|---|
intervaltree-3.0.2.tar.gz (30.7 kB) | Source | None | | |
Hashes for intervaltree-3.0.2.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | cb4f61c81dcb4fea6c09903f3599015a83c9bdad1f0bbd232495e6681e19e273 |
MD5 | 4159801f23d2d9eb98db6c60f9c3b665 |
BLAKE2-256 | e8f976237755b2020cd74549e98667210b2dd54d3fb17c6f4a62631e61d31225 |
In computer science, the Bx-tree is a query- and update-efficient B+ tree-based index structure for moving objects.
Index structure
The base structure of the Bx-tree is a B+ tree in which the internal nodes serve as a directory, each containing a pointer to its right sibling. In the earlier version of the Bx-tree,[1] the leaf nodes contained the moving-object locations being indexed and corresponding index time. In the optimized version,[2] each leaf node entry contains the id, velocity, single-dimensional mapping value and the latest update time of the object. The fanout is increased by not storing the locations of moving objects, as these can be derived from the mapping values.
Utilizing the B+ tree for moving objects
As in many other moving-object indexes, a two-dimensional moving object is modeled as a linear function O = ((x, y), (vx, vy), t), where (x, y) and (vx, vy) are the location and velocity of the object at a given time instance t, i.e., the time of the last update. The B+ tree is a structure for indexing single-dimensional data. In order to adopt the B+ tree as a moving-object index, the Bx-tree uses a linearization technique that maps an object's location at time t to a single-dimensional value. Specifically, objects are first partitioned according to their update time. For objects within the same partition, the Bx-tree stores their locations at a given time, estimated by linear interpolation. By doing so, the Bx-tree keeps a consistent view of all objects within the same partition without storing the update time of each object.
Secondly, the space is partitioned by a grid and the location of an object is linearized within the partitions according to a space-filling curve, e.g., the Peano or Hilbert curves.
Finally, with the combination of the partition number (time information) and the linear order (location information), an object is indexed in the Bx-tree with a one-dimensional index key, the Bxvalue:
Bxvalue(O, tu) = [indexpartition]2 + [xrep]2
Here indexpartition is an index partition determined by the update time, xrep is the space-filling curve value of the object position at the indexed time, [x]2 denotes the binary value of x, and "+" means concatenation.
Given an object O ((7, 2), (-0.1,0.05), 10), tmu = 120, the Bxvalue for O can be computed as follows.
- O is indexed in partition 0 as mentioned. Therefore, indexpartition = (00)2.
- O’s position at the label timestamp of partition 0 is (1,5).
- Using Z-curve with order = 3, the Z-value of O, i.e., xrep is (010011)2.
- Concatenating indexpartition and xrep, Bxvalue = (00010011)2 = 19.
- Example O ((0,6), (0.2, -0.3 ),10) and tmu=120 then O's position at the label timestamp of partition: ???
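The worked example for O ((7, 2), (-0.1, 0.05), 10) can be reproduced with a short Python sketch. Several things here are assumptions rather than facts from the text: the label timestamp of 70 is inferred from the stated position (1, 5), the bit order of the Z-curve (x bit before y bit in each pair) is chosen because it yields the stated (010011)2, and the function names are hypothetical.

```python
def position_at(location, velocity, t_update, t_label):
    """Linearly extrapolate a location from its update time to t_label."""
    dt = t_label - t_update
    return tuple(p + v * dt for p, v in zip(location, velocity))


def z_value(x, y, order):
    """Interleave the low `order` bits of x and y (x bit first in each pair)."""
    z = 0
    for bit in range(order - 1, -1, -1):
        z = (z << 1) | ((x >> bit) & 1)
        z = (z << 1) | ((y >> bit) & 1)
    return z


def bx_value(partition, xrep, order):
    """Concatenate the partition bits in front of the 2*order-bit xrep."""
    return (partition << (2 * order)) | xrep


# Assumed label timestamp 70 (not given in the text; see lead-in).
x, y = position_at((7, 2), (-0.1, 0.05), t_update=10, t_label=70)
cell = (int(round(x)), int(round(y)))    # grid cell (1, 5)
xrep = z_value(cell[0], cell[1], order=3)

print(cell)
print(format(xrep, "06b"))               # 010011
print(bx_value(0, xrep, order=3))        # 19
```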
Insertion, update and deletion
Given a new object, its index key is computed and then the object is inserted into the Bx-tree as in the B+ tree. An update consists of a deletion followed by an insertion. An auxiliary structure is employed to keep the latest key of each object so that an object can be deleted by searching for the key. The indexing key is computed before the tree is modified. In this way, the Bx-tree directly inherits the good properties of the B+ tree and achieves efficient update performance.
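The update scheme can be sketched as follows. This is only an illustration under assumptions: a sorted Python list stands in for the B+ tree, a dict plays the role of the auxiliary structure, and the class name KeyedIndex and its methods are hypothetical. The caller is expected to compute the index key before calling insert or update.

```python
import bisect


class KeyedIndex:
    """Sorted (key, object_id) list standing in for the B+ tree."""

    def __init__(self):
        self.entries = []     # sorted list of (key, object_id)
        self.latest_key = {}  # auxiliary structure: object_id -> latest key

    def insert(self, object_id, key):
        bisect.insort(self.entries, (key, object_id))
        self.latest_key[object_id] = key

    def delete(self, object_id):
        # Look up the stored key, then locate the exact entry by search.
        key = self.latest_key.pop(object_id)
        i = bisect.bisect_left(self.entries, (key, object_id))
        del self.entries[i]

    def update(self, object_id, new_key):
        # An update is a deletion followed by an insertion.
        self.delete(object_id)
        self.insert(object_id, new_key)


index = KeyedIndex()
index.insert("a", 19)
index.insert("b", 7)
index.update("a", 3)
print(index.entries)
```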
Queries
Range query
A range query retrieves all objects whose locations fall within a rectangular range at a time not prior to the current time.
The Bx-tree uses a query-window enlargement technique to answer queries. Since the Bx-tree stores an object's location as of some time after its update time, the enlargement involves two cases: a location must be brought either back to an earlier time or forward to a later time. The main idea is to enlarge the query window so that it encloses all objects whose positions are not within the query window at their label timestamp but will enter the query window at the query timestamp.
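One simple way to picture the enlargement is sketched below. It assumes that global minimum and maximum velocities per axis are known and bounds the window with them; the text does not give the exact enlargement formula, so this is an illustrative bound and the function name enlarge_window is hypothetical.

```python
def enlarge_window(window, v_min, v_max, t_label, t_query):
    """Enlarge a query rectangle from the query time back to the label time.

    window is ((x1, y1), (x2, y2)); v_min and v_max are per-axis velocity
    bounds (assumed known). An object stored at position p for t_label is
    at p + v * (t_query - t_label) at query time, for some v in the bounds.
    """
    (x1, y1), (x2, y2) = window
    dt = t_query - t_label  # negative when the query precedes the label time

    def widen(lo, hi, vmin, vmax):
        shifts = (vmin * dt, vmax * dt)
        return lo - max(shifts), hi - min(shifts)

    ex1, ex2 = widen(x1, x2, v_min[0], v_max[0])
    ey1, ey2 = widen(y1, y2, v_min[1], v_max[1])
    return (ex1, ey1), (ex2, ey2)


# Query 5 time units after the label timestamp.
print(enlarge_window(((10, 10), (20, 20)), (-1, -2), (2, 1), t_label=0, t_query=5))
# Query 5 time units before the label timestamp.
print(enlarge_window(((10, 10), (20, 20)), (-1, -2), (2, 1), t_label=5, t_query=0))
```

Both cases from the text are covered by the sign of dt: a positive dt carries stored locations forward in time, a negative dt brings them back.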
After the enlargement, the partitions of the Bx-tree need to be traversed to find objects falling in the enlarged query window. In each partition, the use of a space-filling curve means that a range query in the native, two-dimensional space becomes a set of range queries in the transformed, one-dimensional space.[1]
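How a two-dimensional window turns into several one-dimensional ranges can be shown with a small sketch. It assumes a Z-curve with the x bit placed before the y bit in each pair and a window given in inclusive grid-cell coordinates; the helper names are hypothetical, and enumerating every cell is only practical for small windows.

```python
def z_value(x, y, order):
    """Interleave the low `order` bits of x and y (x bit first in each pair)."""
    z = 0
    for bit in range(order - 1, -1, -1):
        z = (z << 1) | ((x >> bit) & 1)
        z = (z << 1) | ((y >> bit) & 1)
    return z


def window_to_ranges(x1, y1, x2, y2, order):
    """Map an inclusive cell window to sorted inclusive runs of Z-values."""
    zs = sorted(z_value(x, y, order)
                for x in range(x1, x2 + 1)
                for y in range(y1, y2 + 1))
    ranges = []
    for z in zs:
        if ranges and z == ranges[-1][1] + 1:
            ranges[-1][1] = z        # extend the current run
        else:
            ranges.append([z, z])    # start a new run
    return [tuple(r) for r in ranges]


print(window_to_ranges(0, 0, 1, 1, order=2))  # one run: (0, 3)
print(window_to_ranges(1, 0, 2, 1, order=2))  # two runs: (2, 3) and (8, 9)
```

The second window straddles a quadrant boundary of the curve, so it splits into two separate key ranges.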
To avoid an excessively large query region after expansion in skewed datasets, an optimization of the query algorithm exists,[3] which improves query efficiency by avoiding unnecessary query enlargement.
K nearest neighbor query
A K nearest neighbor query is computed by iteratively performing range queries with an incrementally enlarged search region until k answers are obtained. Another possibility is to employ querying ideas similar to those of the iDistance technique.
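The iterative scheme can be sketched in plain Python. This is a simplification under assumptions: a linear scan over a list of points stands in for the index's range query, the search region is a circle rather than a rectangle, and the initial radius and growth factor are arbitrary illustrative parameters.

```python
import math


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def range_query(points, center, radius):
    """Stand-in for an index range query: all points within radius."""
    return [p for p in points if distance(p, center) <= radius]


def knn(points, center, k, initial_radius=1.0, growth=2.0):
    """Enlarge the search region until at least k answers are found."""
    if k > len(points):
        raise ValueError("fewer than k points available")
    radius = initial_radius
    while True:
        hits = range_query(points, center, radius)
        if len(hits) >= k:
            # Every point outside the circle is farther than every hit,
            # so the k closest hits are the true k nearest neighbors.
            hits.sort(key=lambda p: distance(p, center))
            return hits[:k]
        radius *= growth


points = [(0, 0), (1, 1), (5, 5), (10, 10)]
print(knn(points, (0, 0), k=2))
print(knn(points, (0, 0), k=3))
```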
Other queries
The range query and K nearest neighbor query algorithms can be easily extended to support interval queries, continuous queries, etc.[2]
Adapting relational database engines to accommodate moving objects
Since the Bx-tree is an index built on top of a B+ tree index, all operations in the Bx-tree, including the insertion, deletion and search, are the same as those in the B+ tree. There is no need to change the implementations of these operations. The only difference is to implement the procedure of deriving the indexing key as a stored procedure in an existing DBMS. Therefore, the Bx-tree can be easily integrated into existing DBMS without touching the kernel.
SpADE[4] is a moving-object management system built on top of the popular relational database system MySQL, which uses the Bx-tree for indexing the objects. In the implementation, moving-object data is transformed and stored directly in MySQL, and queries are transformed into standard SQL statements that are processed efficiently by the relational engine. Most importantly, all of this is achieved cleanly and independently, without modifying the MySQL core.
Performance tuning
Potential problem with data skew
The Bx-tree uses a grid for space partitioning when mapping a two-dimensional location to a one-dimensional key. With skewed data, this may degrade the performance of both query and update operations. If the grid cells are oversized, many objects fall into a single cell. Since objects in a cell are indistinguishable to the index, there will be overflow nodes in the underlying B+ tree. The existence of overflow pages not only destroys the balance of the tree but also increases the update cost. For queries, a large cell incurs more false positives for a given query region and increases the processing time. On the other hand, if the space is partitioned with a finer grid, i.e., smaller cells, each cell contains few objects. There are hardly any overflow pages, so the update cost is minimized, and fewer false positives are retrieved in a query. However, more cells need to be searched, and the increase in the number of cells searched also increases the workload of a query.
Index tuning
The ST2B-tree[5] introduces a self-tuning framework for tuning the performance of the Bx-tree while dealing with data skew in space and data change with time. In order to deal with data skew in space, the ST2B-tree splits the entire space into regions of different object density using a set of reference points. Each region uses an individual grid whose cell size is determined by the object density inside it.
The Bx-tree has multiple partitions corresponding to different time intervals. As time elapses, each partition grows and shrinks alternately. The ST2B-tree uses this feature to tune the index online, adjusting the space partitioning to accommodate changes in the data over time. In particular, when a partition shrinks to empty and starts growing again, it chooses a new set of reference points and a new grid for each reference point according to the latest data density. The tuning is based on the latest statistics collected during a given period of time, so the space partitioning is expected to fit the latest data distribution. In this way, the ST2B-tree is expected to minimize the effect of data skew in space and data changes over time.
References
- [1] Christian S. Jensen, Dan Lin, and Beng Chin Ooi. Query and Update Efficient B+tree based Indexing of Moving Objects. In Proceedings of the 30th International Conference on Very Large Data Bases (VLDB), pages 768-779, 2004.
- [2] Dan Lin. Indexing and Querying Moving Objects Databases. PhD thesis, National University of Singapore, 2006.
- [3] C. S. Jensen, D. Tiesyte, and N. Tradisauskas. Robust B+-Tree-Based Indexing of Moving Objects. In Proceedings of the Seventh International Conference on Mobile Data Management, Nara, Japan, 9 pages, May 9–12, 2006.
- [4] SpADE: A SPatio-temporal Autonomic Database Engine for location-aware services.
- [5] Su Chen, Beng Chin Ooi, Kian-Lee Tan, and Mario A. Nascimento. ST2B-tree: A Self-Tunable Spatio-Temporal B+-tree for Moving Objects. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 29-42, 2008.