See the corresponding release blogpost for a deeper dive into several features.
2.29.5 overview:
- REST API for ~100X bigger uploads in multiple formats: XLS, CSV, JSON, Parquet, & ORC, with CSV/Parquet/ORC being GPU-accelerated
- RAPIDS 0.13 (from 0.11): Major updates to core + BlazingSQL
- Continued 2.0 pageload & stability improvements
As items finalize, they will be included in the release notes below. User-critical features & fixes may change the schedule.
Graphistry Versions

| Component | Version |
| --- | --- |
| Server | 2.29.5 |
| JS React+Vanilla | 3.7.1 |
| Python PyGraphistry client | 0.10.6 |
Third-Party Versions

| Component | Version |
| --- | --- |
| BlazingSQL | 0.13 (from 0.11) |
| Caddy | 1.0.3 |
| CUDA (in-Docker) | 10.0 |
| Docker (CE) | 19.03.2 |
| Docker Compose | 1.24.1 |
| Elasticsearch node driver | 14.2.2 |
| Pandas | 0.24.2 |
| Python | 3.7.3 |
| Neo4j node driver | 1.7.6 |
| RAPIDS | 0.13 (from 0.11) |
| Arrow | 0.15.0 |
| Splunk node SDK | 1.9.0 |
Summary
The primary updates are around the new upload API, RAPIDS 0.13 upgrades, improved pageload speed, and (many) bugfixes.
New Features
- Core
- Uploads: Up to 100X bigger and faster uploads
- RAPIDS 0.13: Multi-GPU and out-of-memory, perf/fixes
- BlazingSQL 0.13: More SQL coverage, multi-GPU
- Faster
- Graph visualization load
- Histogram creation
- API 2.0: Initial release of the 2.0 engine APIs - increased scale, security, & features
- Backwards compatibility: 1.0 APIs will continue to function through 2020
- Authentication: JWT-based
- Uploads: 100X bigger uploads
Docs
- API 2.0 preview for uploads: gist of using the REST API via Python
- Launch of RAPIDS Academy and associated guides
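The linked gist walks through using the REST upload API from Python. As a rough illustration of that shape: the endpoint paths, payload field names, and the `make_bindings` helper below are assumptions for illustration only, so verify them against the gist and your server's API docs.

```python
# Hedged sketch of a 2.0 REST upload flow. Endpoint paths, payload field
# names, and make_bindings() are illustrative assumptions -- check the
# linked gist and your server's API docs for the real shapes.
import json

def make_bindings(name, node_col, src_col, dst_col):
    """Dataset-creation payload: tells the server which columns in the
    uploaded node/edge tables are the node id, edge source, and edge
    destination."""
    return {
        "name": name,
        "node_encodings": {"bindings": {"node": node_col}},
        "edge_encodings": {"bindings": {"source": src_col,
                                        "destination": dst_col}},
    }

payload = make_bindings("demo", node_col="n", src_col="s", dst_col="d")
print(json.dumps(payload, indent=2))

# Against a live server (with the `requests` package), the calls would be
# roughly:
#   tok = requests.post(f"{host}/api/v2/auth/token/generate",
#                       json={"username": u, "password": p}).json()["token"]
#   hdr = {"Authorization": f"Bearer {tok}"}
#   ds = requests.post(f"{host}/api/v2/upload/datasets",
#                      headers=hdr, json=payload).json()
#   requests.post(f"{host}/api/v2/upload/datasets/.../nodes",
#                 headers=hdr, json=[{"n": "a"}, {"n": "b"}])
#   requests.post(f"{host}/api/v2/upload/datasets/.../edges",
#                 headers=hdr, json=[{"s": "a", "d": "b"}])
```

Authentication is JWT-based per the 2.0 API notes above, hence the bearer-token header on every upload call.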
Major Fixes & Tweaks
- Python API:
- Python color palette values now match the officially documented version, fixing buggy 2.0 engine behavior where length-3 palettes were not included
- NodeXL: Pandas engine-related errors addressed by disabling engine-override. Ensure your Pandas environment has a working Excel engine, or provide a Pandas-compatible Excel object.
- Visualization
- Up to 100X bigger & faster uploads via new upload API, initially REST-only
- We have tested up to 100M nodes & edges in < 10s
- Note that the UI still maxes around 7M nodes/edges, with recommended configuration determined by client hardware, often around 100K-500K nodes and 1-2M edges
- Pageload speedup
- Faster initial + cached
- Especially faster when using the new Upload API
- Label updates: 'Cull isolated nodes' -> 'Hide standalone nodes', and edge inspector shows a directed arrow to match the directed nature of the default edges
- In 'Hide standalone nodes' mode, nodes with only self-edges are now also considered standalone and will be filtered out
- Various issues around filters
- 2.28.7 Users:
- Less memory consumption by 'forge-etl-python' process
- Fewer cases of the freeze at '15% loading'
- Fewer cases of inability to read datasets after restart
Administration
- We are tuning multitenant memory consumption, starting with an initial rough LRU policy
- GPU: max(all active visualizations, 10 most recent visualizations) per worker
- CPU: max(all active visualizations, 40 most recent visualizations) per worker
- Initially primarily impacts forge-etl-python process
- Dataset upload limit is set to 1GB by default (becomes active in next release)
- Override in your `custom.env` by setting `UPLOAD_MAX_SIZE=10M` (10 megabytes) or `UPLOAD_MAX_SIZE=10G` (10 gigabytes)
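For example, to raise the default 1GB cap to 10GB once the limit becomes active, a `custom.env` might contain:

```shell
# custom.env -- raise the dataset upload cap
# (the limit becomes active in the next release)
UPLOAD_MAX_SIZE=10G
```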
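The LRU retention rule above (keep the larger of all active visualizations and the N most recent) can be sketched as follows. This is an illustrative model only, with assumed names, not Graphistry's implementation:

```python
# Illustrative sketch of the per-worker LRU retention rule (assumed
# semantics, not Graphistry's code): a visualization stays resident if it
# is still active OR among the N most recently used, so the resident set
# is at least max(all active, N most recent).
def resident_set(history, n_recent):
    """history: list of (viz_id, is_active), ordered most-recent-first."""
    keep = {viz for viz, active in history if active}   # all active
    keep.update(viz for viz, _ in history[:n_recent])   # N most recent
    return keep

history = [("v1", False), ("v2", True), ("v3", False), ("v4", True)]
resident_set(history, 2)  # keeps v1, v2 (most recent) and v4 (still active)
```

With N=10 for GPU memory and N=40 for CPU memory per worker, this matches the max(active, N-most-recent) policy described above.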
Migration
- Python API:
- Code relying on buggy Python color palette values should use the correct values
- NodeXL integration code relying on a Pandas Excel engine override will require Pandas being set to that engine
- Much of the visualization load sequence is being rewritten over this and the next two releases to be faster, safer, more scalable, and more controllable. Please notify the team of any problematic workloads you encounter so we can add them to our test suite.