The user should be able to modify the array schema (e.g., add/remove attributes, change filters, tile capacity, etc.) and time travel across the resulting schema versions.
Arrow support (data representations of jagged arrays)
Working with nested arrays is a crucial task in most scientific fields. I think TileDB could leverage its strengths to support the community working in that area: https://youtu.be/jvt4v2LTGK0?t=1366 Combining data management in TileDB with data wrangling in awkward-array ( https://github.com/scikit-hep/awkward-1.0 ) or other libraries with Arrow support would be an extremely useful workflow. Any updates on when Arrow will be supported?
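For readers unfamiliar with how jagged arrays are represented, here is a minimal sketch of the offsets-plus-values layout that Arrow's variable-size list type is built on. It is plain Python and does not call any TileDB, Arrow, or awkward-array API; the function names `to_offsets_and_values` and `row` are made up for illustration.

```python
def to_offsets_and_values(nested):
    """Flatten a list of variable-length rows into (offsets, values).

    Row i occupies values[offsets[i]:offsets[i + 1]], so there is one
    more offset than there are rows.
    """
    offsets = [0]
    values = []
    for row_items in nested:
        values.extend(row_items)
        offsets.append(len(values))
    return offsets, values


def row(offsets, values, i):
    """Recover row i from the flat representation."""
    return values[offsets[i]:offsets[i + 1]]


# A jagged array with rows of length 3, 0, and 2.
jagged = [[1.0, 2.0, 3.0], [], [4.0, 5.0]]
offsets, values = to_offsets_and_values(jagged)
```

Both buffers are flat and fixed-width, which is what would make this kind of data a natural fit for a dense or sparse array store.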
TileDB currently performs only slicing. It should allow other computations, such as filters, group-by queries, and joins. This will help high-level applications push compute closer to storage.
Support axes labels
TileDB should support attaching axes labels (dataframes in their full generality), so that the user can slice the array based on arbitrary axes label predicates.
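To make the request concrete, here is a rough sketch of what label-based slicing amounts to: a label predicate is first translated into integer positions along the axis, and the array is then sliced by those positions. This is plain Python, not a TileDB API; the helper names and the sample data are hypothetical, and the range lookup assumes the labels are sorted.

```python
from bisect import bisect_left, bisect_right


def label_range_to_index_range(labels, lo, hi):
    """For sorted labels, return the half-open index range [start, stop)
    covering every label in the closed interval [lo, hi]."""
    return bisect_left(labels, lo), bisect_right(labels, hi)


def select_by_predicate(labels, predicate):
    """Return the positions of all labels satisfying an arbitrary predicate."""
    return [i for i, label in enumerate(labels) if predicate(label)]


# Hypothetical axis labels (ISO dates sort correctly as strings) and data.
timestamps = ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"]
data = [10, 20, 30, 40]

# Range predicate on a sorted axis: binary search, then an ordinary slice.
start, stop = label_range_to_index_range(timestamps, "2020-01-02", "2020-01-03")
window = data[start:stop]

# Arbitrary predicate: scan the labels, then gather the matching cells.
picked = [data[i] for i in select_by_predicate(timestamps, lambda s: s.endswith("4"))]
```

Today this translation has to live in the application; the request is for the storage engine to own it.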
Better Streaming Support
TileDB is already very good for storing most information, and I like that the goal is for it to become a 'Universal Storage Engine', but it currently neglects streaming-data applications, such as saving data every second or minute.
Support for deletions
The user should be able to delete any number of cells. Currently this is possible by directly inserting "tombstones", but the deletion logic falls on the higher-level application. TileDB should be able to natively support deletions.
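As an illustration of the logic that currently falls on the application, here is a minimal sketch of tombstone-based deletion over an append-only log of timestamped writes. It is plain Python and not TileDB code; `TOMBSTONE`, `write`, `delete`, and `read` are hypothetical names, and a real application would store the marker as a reserved attribute value rather than a Python object.

```python
# Sentinel marking a deleted cell (an assumption for this sketch).
TOMBSTONE = object()


def write(log, timestamp, coords, value):
    """Append a timestamped cell write to the log."""
    log.append((timestamp, coords, value))


def delete(log, timestamp, coords):
    """Deleting a cell is just writing a tombstone at its coordinates."""
    log.append((timestamp, coords, TOMBSTONE))


def read(log, at_time=None):
    """Return the live cells, optionally as of a past timestamp."""
    latest = {}
    for timestamp, coords, value in sorted(log, key=lambda entry: entry[0]):
        if at_time is not None and timestamp > at_time:
            break
        latest[coords] = value  # later writes overwrite earlier ones
    # The filtering step every reader must remember to apply.
    return {c: v for c, v in latest.items() if v is not TOMBSTONE}


log = []
write(log, 1, (0, 0), "a")
write(log, 2, (0, 1), "b")
delete(log, 3, (0, 0))
```

Reading at the latest time hides the deleted cell, while reading as of timestamp 2 still shows it, so deletions stay compatible with time travel. The cost is that every reader has to repeat the filtering step, which is the burden native support would remove.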
Integration into distributed SQL database
It would be super cool to have a distributed database with co-located processing on top of TileDB. Apache Ignite is such a distributed database engine, but it currently only supports table-based data. My dream is a distributed array database that allows fast selection along dimensions (-> secondary indices), fast distributed joins, and co-located processing.