See the corresponding release blogpost for a deeper dive into several features.
2.29.5 overview:
- REST API for ~100X bigger uploads in multiple formats: XLS, CSV, JSON, Parquet, & ORC, with CSV/Parquet/ORC being GPU-accelerated
- RAPIDS 0.13 (from 0.11): Major updates to core + BlazingSQL
- Continued 2.0 pageload & stability improvements
As items finalize, they will be included in the release notes below. User-critical features & fixes may change the schedule.
Graphistry Versions

| Component | Version |
| --- | --- |
| Server | 2.29.5 |
| JS React+Vanilla | 3.7.1 |
| Python PyGraphistry client | 0.10.6 |
Third-Party Versions

| Component | Version |
| --- | --- |
| BlazingSQL | 0.13 (from 0.11) |
| Caddy | 1.0.3 |
| CUDA (in-Docker) | 10.0 |
| Docker (CE) | 19.03.2 |
| Docker Compose | 1.24.1 |
| Elasticsearch node driver | 14.2.2 |
| Pandas | 0.24.2 |
| Python | 3.7.3 |
| Neo4j node driver | 1.7.6 |
| RAPIDS | 0.13 (from 0.11) |
| Arrow | 0.15.0 |
| Splunk node SDK | 1.9.0 |
Summary
The primary updates are around the new upload API, RAPIDS 0.13 upgrades, improved pageload speed, and (many) bugfixes.
New Features
- Core
- Uploads: Up to 100X bigger and faster uploads
- RAPIDS 0.13: Multi-GPU and out-of-memory, perf/fixes
- BlazingSQL 0.13: More SQL coverage, multi-GPU
- Faster
- Graph visualization load
- Histogram creation
- API 2.0: Initial release of the 2.0 engine APIs - increased scale, security, & features
- Backwards compatibility: 1.0 APIs will continue to function through 2020
- Authentication: JWT-based
- Uploads: 100X bigger uploads
Docs
- API 2.0 preview for uploads: gist of using the REST API via Python
- Launch of RAPIDS Academy and associated guides
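The linked gist walks through using the REST upload API from Python. As a rough illustration of that shape: the endpoint paths, payload field names, and the `make_bindings` helper below are assumptions for illustration only, so verify them against the gist and your server's API docs.

```python
# Hedged sketch of a 2.0 REST upload flow. Endpoint paths, payload field
# names, and make_bindings() are illustrative assumptions -- check the
# linked gist and your server's API docs for the real shapes.
import json

def make_bindings(name, node_col, src_col, dst_col):
    """Dataset-creation payload: tells the server which columns in the
    uploaded node/edge tables are the node id, edge source, and edge
    destination."""
    return {
        "name": name,
        "node_encodings": {"bindings": {"node": node_col}},
        "edge_encodings": {"bindings": {"source": src_col,
                                        "destination": dst_col}},
    }

payload = make_bindings("demo", node_col="n", src_col="s", dst_col="d")
print(json.dumps(payload, indent=2))

# Against a live server (with the `requests` package), the calls would be
# roughly:
#   tok = requests.post(f"{host}/api/v2/auth/token/generate",
#                       json={"username": u, "password": p}).json()["token"]
#   hdr = {"Authorization": f"Bearer {tok}"}
#   ds = requests.post(f"{host}/api/v2/upload/datasets",
#                      headers=hdr, json=payload).json()
#   requests.post(f"{host}/api/v2/upload/datasets/.../nodes",
#                 headers=hdr, json=[{"n": "a"}, {"n": "b"}])
#   requests.post(f"{host}/api/v2/upload/datasets/.../edges",
#                 headers=hdr, json=[{"s": "a", "d": "b"}])
```

Authentication is JWT-based per the 2.0 API notes above, hence the bearer-token header on every upload call.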
Major Fixes & Tweaks
- Python API:
- Python color palette values now match the officially documented version, fixing buggy 2.0 engine behavior where length-3 palettes were not included
- NodeXL: Pandas engine-related errors addressed by disabling engine-override. Ensure your Pandas environment has a working Excel engine, or provide a Pandas-compatible Excel object.
- Visualization
- Up to 100X bigger & faster uploads via new upload API, initially REST-only
- We have tested up to 100M nodes & edges in < 10s
- Note that the UI still maxes around 7M nodes/edges, with recommended configuration determined by client hardware, often around 100K-500K nodes and 1-2M edges
- Pageload speedup
- Faster initial + cached
- Especially faster when using the new Upload API
- Label updates: 'Cull isolated nodes' -> 'Hide standalone nodes', and edge inspector shows a directed arrow to match the directed nature of the default edges
- In 'Hide standalone nodes' mode, nodes with only self-edges are now also considered standalone and will be filtered out
- Various issues around filters
- 2.28.7 Users:
- Less memory consumption by 'forge-etl-python' process
- Fewer cases of the freeze at '15% loading'
- Fewer cases of inability to read datasets after restart
Administration
- We are tuning multitenant memory consumption, starting with an initial rough LRU policy
- GPU: max(all active visualizations, 10 most recent visualizations) per worker
- CPU: max(all active visualizations, 40 most recent visualizations) per worker
- Initially primarily impacts forge-etl-python process
- Dataset upload limit is set to 1GB by default (becomes active in next release)
- Override in your `custom.env` by setting `UPLOAD_MAX_SIZE=10M` (10 megabytes) or `UPLOAD_MAX_SIZE=10G` (10 gigabytes)
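For example, to raise the default 1GB cap to 10GB once the limit becomes active, a `custom.env` might contain:

```shell
# custom.env -- raise the dataset upload cap
# (the limit becomes active in the next release)
UPLOAD_MAX_SIZE=10G
```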
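The LRU retention rule above (keep the larger of all active visualizations and the N most recent) can be sketched as follows. This is an illustrative model only, with assumed names, not Graphistry's implementation:

```python
# Illustrative sketch of the per-worker LRU retention rule (assumed
# semantics, not Graphistry's code): a visualization stays resident if it
# is still active OR among the N most recently used, so the resident set
# is at least max(all active, N most recent).
def resident_set(history, n_recent):
    """history: list of (viz_id, is_active), ordered most-recent-first."""
    keep = {viz for viz, active in history if active}   # all active
    keep.update(viz for viz, _ in history[:n_recent])   # N most recent
    return keep

history = [("v1", False), ("v2", True), ("v3", False), ("v4", True)]
resident_set(history, 2)  # keeps v1, v2 (most recent) and v4 (still active)
```

With N=10 for GPU memory and N=40 for CPU memory per worker, this matches the max(active, N-most-recent) policy described above.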
Migration
- Python API:
- Code relying on buggy Python color palette values should use the correct values
- NodeXL integration code relying on a Pandas Excel engine override will require Pandas being set to that engine
- Much of the visualization load sequence is being rewritten over this and the next two releases to be faster, safer, more scalable, and more controllable. Please notify the team of any problematic workloads you encounter so we can add them to our test suite.