The Seattle Report on Database Research
Meeting was held in 2018
What Has Changed in the Last Five Years?
- ML
- DNN for image analysis and NLP
- TensorFlow, PyTorch
- FPGA, GPU
- Data Science: cleaning, transformation, statistics, visualization, and ML.
- Data Governance: GDPR necessitates provenance, metadata management
technology, confidential cloud computing, ethical and fair use of data...
- Cloud: Usage has grown
- serverless -- on-demand/high elasticity instead of provisioned resources -- is a thing
- convergence towards data lake architecture with on-demand analysis via
elastic compute (thus: separation of compute and storage)
- IoT: has accelerated
- Hardware:
  - Accelerators: FPGAs, GPUs, ASICs
  - new SSDs and NVRAM
  - Better bandwidth and latency
    - in datacenter via new interconnects
    - outside via 5G
Research Challenges
Data Science
Instead of OLAP and SQL, Jupyter Notebooks are the new de facto standard, with
an ecosystem of open-source libraries.
Data wrangling is the main challenge; the focus should be on end-to-end solutions.
Data provenance (Where is the data from? Is it fresh?) is important.
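
The two provenance questions above (origin and freshness) can be illustrated with a minimal sketch. The `Provenance` class, its fields, and the source name are hypothetical and chosen only for illustration; the report does not prescribe any particular representation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record attached to a dataset."""
    source: str             # where the data came from
    retrieved_at: datetime  # when it was fetched

    def is_fresh(self, max_age: timedelta, now: datetime) -> bool:
        """Return True if the data is no older than max_age at time `now`."""
        return now - self.retrieved_at <= max_age


# Fixed timestamps keep the example deterministic.
now = datetime(2020, 1, 2, tzinfo=timezone.utc)
prov = Provenance(
    source="sensor-feed-a",  # assumed source name
    retrieved_at=datetime(2020, 1, 1, tzinfo=timezone.utc),
)

fresh_within_week = prov.is_fresh(timedelta(days=7), now)
fresh_within_hour = prov.is_fresh(timedelta(hours=1), now)
```

Here the one-day-old data counts as fresh under a one-week limit but not under a one-hour limit.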
Data Governance
Key motivator: GDPR
Provenance
Fixing privacy via cryptography
Ethical data science: the problem of racist AI and fake news.
Cloud
Similar to the Beckman Report, except that the cloud isn't described as "cheaper".
Separation of compute & storage
Multi-tenancy
Minimizing lock-in
DB Engines
Heterogeneous computation becomes important.
Data lakes and modern data warehousing
Distributed transactions
Better benchmarks are required
Two holy grails:
- reduce the impedance mismatch between application development and database queries
- find a way to make DB systems less rigid without sacrificing much performance
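
To make the first point concrete, here is a small sketch of the impedance mismatch using Python's standard `sqlite3` module. The `users` table and the `User` class are made up for illustration; the report itself gives no example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    """Application-side object (hypothetical schema)."""
    id: int
    name: str


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)

# The query is an opaque string to the host language: a misspelled column
# name is only detected when the statement runs.
rows = conn.execute(
    "SELECT id, name FROM users WHERE id > ? ORDER BY id", (0,)
).fetchall()

# Rows come back as plain tuples and must be mapped to application
# objects by hand.
users = [User(id=r[0], name=r[1]) for r in rows]
conn.close()
```

The mismatch shows up in two places: the query lives outside the language's type system, and the result has to be translated manually into the application's data model.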
Community
Focus should be on real, user-focused open source systems
Less pessimistic / alarmist than the last two reports; maybe they gave up.
Research problems:
- too few people serve on program committees (PCs)
- reproducibility is mediocre
- "new algo vs baseline" papers are overvalued compared to papers describing
innovative systems, applications, experiments and analyses, or user studies.